ON THE LONG TIME BEHAVIOR OF SECOND ORDER DIFFERENTIAL 
EQUATIONS WITH ASYMPTOTICALLY SMALL DISSIPATION 

ALEXANDRE CABOT, HANS ENGLER, AND SEBASTIEN GADAT 

Abstract. We investigate the asymptotic properties as $t \to \infty$ of the following
differential equation in the Hilbert space $H$

(S)    $\ddot{x}(t) + a(t)\dot{x}(t) + \nabla G(x(t)) = 0, \quad t \ge 0,$

where the map $a : \mathbb{R}_+ \to \mathbb{R}_+$ is non increasing and the potential $G : H \to \mathbb{R}$ is
of class $C^1$. If the coefficient $a(t)$ is constant and positive, we recover the so-called
"Heavy Ball with Friction" system. On the other hand, when $a(t) = 1/(t+1)$ we
obtain the trajectories associated to some averaged gradient system. Our analysis
is mainly based on the existence of some suitable energy function. When the
function $G$ is convex, the condition $\int_0^\infty a(t)\,dt = \infty$ guarantees that the energy
function converges toward its minimum. The more stringent condition
$\int_0^\infty e^{-\int_0^t a(s)\,ds}\,dt < \infty$ is necessary to obtain the convergence of the trajectories
of (S) toward some minimum point of $G$. In the one-dimensional setting, a precise
description of the convergence of solutions is given for a general non-convex
function $G$. We show that in this case the set of initial conditions for which
solutions converge to a local minimum is open and dense.



1. Introduction 

Throughout this paper, we study the differential equation

(S)    $\ddot{x}(t) + a(t)\dot{x}(t) + \nabla G(x(t)) = 0, \quad t \ge 0,$

in a finite- or infinite-dimensional Hilbert space $H$, where the map $G : H \to \mathbb{R}$ is
at least of class $C^1$ and $a : \mathbb{R}_+ \to \mathbb{R}_+$ is a non increasing function. To motivate our
study, let us describe four examples and applications which are intimately
connected with equation (S).

Averaged gradient system. For the potential $G$, the much studied gradient flow is
defined as the solution map $y(0) \mapsto y(s)$, $s \ge 0$, of the differential equation

    $\dot{y}(s) = -g(y(s)) = -\nabla G(y(s)).$

It is of interest to consider the case where $\dot{y}(s)$ is proportional, not to the
instantaneous value of $\nabla G(y(s))$, but to some average of $\nabla G(y(\tau))$, $\tau \le s$. The
simplest such equation is

(1)    $\dot{z}(s) + \frac{1}{s}\int_0^s g(z(\tau))\,d\tau = 0.$

For more general gradient systems with memory terms involving kernels, we refer
for example to [9]. After multiplying equation (1) by $s$ and differentiating, this
leads to the ordinary differential equation

    $s\,\ddot{z}(s) + \dot{z}(s) + g(z(s)) = 0,$

which becomes

(2)    $\ddot{x}(t) + \frac{1}{t}\dot{x}(t) + g(x(t)) = 0$

after the change of variables $s = t^2/4$, $t = 2\sqrt{s}$, $x(t) = z(t^2/4)$, $z(s) = x(2\sqrt{s})$. This
is the problem (S) with $a(t) = 1/t$. We note that in the special case where $g(\xi) = \xi$,
(2) is a Bessel equation. All solutions with a finite limit at $t = 0$ are multiples of

    $J_0(t) = \sum_{k=0}^{\infty} (-1)^k \frac{t^{2k}}{4^k (k!)^2}.$

It is well known that

    $J_0(t) \sim \sqrt{\frac{2}{\pi t}}\,\cos\left(t - \frac{\pi}{4}\right)$

and therefore

    $z(s) = C \cdot J_0(2\sqrt{s}) \sim C \cdot \frac{1}{\sqrt{\pi}\,s^{1/4}}\,\cos\left(2\sqrt{s} - \frac{\pi}{4}\right)$

for some suitable constant $C$ as $s, t \to \infty$. Thus the solution $z$ of the averaged
system (1) converges to zero just as the solution $y(s) = y(0)\,e^{-s}$ of the corresponding
gradient system does, but it does so much more slowly (at an algebraic rate), and
it oscillates infinitely often. Our work will generalize this simple and well-known
example to several classes of $a$ and $G$. The case where $g(x) = x^5 - x$ and $H = \mathbb{R}$ was
discussed in [14].
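The comparison between the power series and the asymptotic form of $J_0$ can be checked numerically. The following is a minimal sketch using only the two formulas displayed above; the function names (`bessel_j0_series`, `bessel_j0_asymptotic`, `z_averaged`) and the truncation at 80 terms are illustrative assumptions, not part of the paper.

```python
import math

def bessel_j0_series(t, terms=80):
    """Partial sum of J_0(t) = sum_k (-1)^k t^(2k) / (4^k (k!)^2)."""
    total, term = 0.0, 1.0  # 'term' holds the k-th summand, starting at k = 0
    for k in range(terms):
        total += term
        # ratio of consecutive summands: -(t^2 / 4) / (k + 1)^2
        term *= -(t * t / 4.0) / ((k + 1) ** 2)
    return total

def bessel_j0_asymptotic(t):
    """Leading-order behavior sqrt(2 / (pi t)) * cos(t - pi / 4), valid for large t."""
    return math.sqrt(2.0 / (math.pi * t)) * math.cos(t - math.pi / 4.0)

def z_averaged(s, c=1.0):
    """Solution z(s) = C * J_0(2 sqrt(s)) of the averaged system when g(xi) = xi."""
    return c * bessel_j0_series(2.0 * math.sqrt(s))

if __name__ == "__main__":
    for t in (5.0, 10.0, 20.0):
        print(t, bessel_j0_series(t), bessel_j0_asymptotic(t))
```

The truncated series is only reliable for moderate $t$ in floating point (the summands grow large before cancelling), which is why the sketch stops at $t = 20$.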

Heavy Ball with Friction system. Particular attention has recently been devoted
to the so-called "Heavy Ball with Friction" system

(HBF)    $\ddot{x}(t) + \gamma\,\dot{x}(t) + \nabla G(x(t)) = 0,$

where $\gamma > 0$ is a positive damping parameter. From a mechanical point of view,
the (HBF) system corresponds to the equation describing the motion of a material
point subjected to the conservative force $-\nabla G(x)$ and the viscous friction force
$-\gamma\,\dot{x}$.

The (HBF) system is dissipative and can be studied in the classical framework of
the theory of dissipative dynamical systems (cf. Hale [10], Haraux [11]). The
presence of the inertial term $\ddot{x}(t)$ makes it possible to overcome some drawbacks of
the steepest descent method. The main interest of the (HBF) system in numerical
optimization is that it is not a descent method: trajectories may go up and down
along the graph of $G$. The trajectories of (HBF) are known to converge toward a
critical point of $G$ under various assumptions such as convexity or analyticity. In the
convex setting, the proof of convergence relies on the Opial lemma, see Alvarez
[2] and Attouch-Goudou-Redont, while it uses the Lojasiewicz inequality in the
case of analytic assumptions (cf. Haraux-Jendoubi [12]).

In the above (HBF) model, the damping coefficient $\gamma$ is constant. A natural
extension consists in introducing a time-dependent damping coefficient, thus leading to
the system (S). In our paper, we will focus on the important case corresponding
to a vanishing damping term $a(t)$, i.e. $a(t) \to 0$ as $t \to \infty$. It is clear that the decay
properties of the map $a$ play a central role in the asymptotic behavior of (S). In
particular, if the quantity $a(t)$ tends to $0$ too rapidly as $t \to \infty$, convergence of the
trajectory may fail (think about the extreme case of $a \equiv 0$ for instance).

Semilinear Elliptic equations. Consider the semilinear elliptic system

    $\Delta u(y) + g(u(y)) = 0$

in $\mathbb{R}^m$, where $u : \mathbb{R}^m \to \mathbb{R}^n$ is the unknown function. Radial solutions $u(y) =
x(|y|)$ of this system lead to the ordinary differential equation

    $\ddot{x}(r) + \frac{m-1}{r}\,\dot{x}(r) + g(x(r)) = 0.$

There has been a large amount of work on this problem; see e.g. [15] for a recent
overview.

Stochastic Approximation algorithms. The classical stochastic algorithm introduced
by [16] is used in many fields of approximation theory. This method is frequently
used to approximate, with a random version of the explicit Euler scheme, the
behavior of the ordinary differential equation $\dot{x}(t) = -g(x(t))$. If we denote by
$(X^n)_{n \in \mathbb{N}}$ the random approximations, and by $(\omega^n)_{n \ge 1}$ and $(\eta^n)_{n \ge 1}$ two auxiliary
stochastic processes, the recursive approximation is generally written as

(3)    $X^0 \in \mathbb{R}^d, \qquad X^{n+1} = X^n - \varepsilon_{n+1}\,g(X^n, \omega^{n+1}) + \varepsilon_{n+1}\,\eta^{n+1}, \quad \forall n \in \mathbb{N},$

where the gain of the algorithm $(\varepsilon_n)$ is a sequence of positive real numbers and $\eta^n$ is
a small residual perturbation which is zero in many cases. Denoting by $(\mathcal{F}_n)_{n \ge 1}$ the
set of measurable events at time $t = n$, solutions of (3) are shown to asymptotically
behave like those of the deterministic o.d.e. $\dot{x}(t) = -g(x(t))$ provided $\eta^n = o(\varepsilon_n)$,
$\Delta M^n = g(X^n) - g(X^n, \omega^{n+1})$ is the increment of a (local) $\mathcal{F}_n$-martingale, and the
sequence $(\varepsilon_n)_{n \ge 1}$ satisfies the baseline assumptions:

    $\sum_{n=1}^{\infty} \varepsilon_n = \infty \quad\text{and}\quad \sum_{n=1}^{\infty} \varepsilon_n^{1+\alpha} < \infty, \quad\text{for some } \alpha > 0.$

The assumption on the martingale increment implies that $g(X^n) = \mathbb{E}\left[g(X^n, \omega^{n+1}) \mid \mathcal{F}_n\right]$.
A very common case occurs when $(\omega^n)_{n \ge 1}$ is a sequence of independent identically
distributed variables with distribution $\mu$ and $g(x, \cdot)$ is $\mu$-integrable:

    $\forall x, \quad g(x) := \int g(x, \omega)\,\mu(d\omega).$

This yields the stochastic gradient descent algorithm when $g$ is the gradient
operator of a potential $G$. One recent application [8] used this stochastic gradient
to perform feature selection among a large number of variables with a simpler Euler
scheme $X^{n+1} = X^n - \varepsilon_n\,g(X^n, \omega^{n+1})$. Further developments have shown that
in some cases, the random variable $g(X^n, \cdot)$ may have a large variance and the
stochastic approximation of $g(X^n)$ by $g(X^n, \omega^{n+1})$ can be numerically improved
using the following modified recursive definition:



(4) X" +1 = X" - e„ + i^- 



- i 



One can think of (4) as a way to reduce the instability of the gradient estimate
$g(X^n)$ by averaging the variables $\{g(X^k, \omega^{k+1}),\ k \le n\}$ with weights
given by the $\varepsilon_n$. Actually one can show (see the proof in Appendix A) that
the limit o.d.e. is given by an equation of type (S):

(5) x(«) = - *">y , 

for some $\beta > 0$. In the particular case $\beta = 0$, we obtain the averaged gradient
system equation (1).
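The basic recursion (3) can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the algorithm of [8] or the averaged scheme (4): the residual $\eta^n$ is taken to be zero, the gain is $\varepsilon_n = 1/n$ (which satisfies both summability conditions), and the noisy gradient $g(x, \omega) = x + \omega$ with standard Gaussian $\omega$ is a hypothetical toy model whose mean is the gradient of $G(x) = x^2/2$.

```python
import random

def noisy_grad(x, omega):
    """Toy observation g(x, omega) = x + omega; its mean over omega is g(x) = x."""
    return x + omega

def stochastic_approximation(x0, grad, n_steps, seed=0):
    """Iterate X^{n+1} = X^n - eps_{n+1} * g(X^n, omega^{n+1}) with eps_n = 1/n."""
    rng = random.Random(seed)          # fixed seed so the run is reproducible
    x = x0
    for n in range(n_steps):
        eps = 1.0 / (n + 1)            # gain eps_{n+1}
        omega = rng.gauss(0.0, 1.0)    # i.i.d. noise sample omega^{n+1}
        x = x - eps * grad(x, omega)
    return x

if __name__ == "__main__":
    print(stochastic_approximation(5.0, noisy_grad, 20000))
```

With this particular gain and toy gradient the iterate after $n$ steps is minus the empirical mean of the noise samples, so it approaches the minimizer $0$ of $G$ at the usual $n^{-1/2}$ rate.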

The analysis of the asymptotic behavior of (S) is based on the use of the
energy function $\mathcal{E}$ defined by $\mathcal{E}(t) = \frac{1}{2}|\dot{x}(t)|^2 + G(x(t))$ for every $t \ge 0$. Under
convex-like assumptions on $G$, we prove the convergence of the quantity $\mathcal{E}(t)$
toward $\min G$ as $t \to \infty$, provided that $\int_0^\infty a(t)\,dt = \infty$. This condition expresses that
the damping coefficient $a(t)$ slowly tends to $0$ as $t \to \infty$. Such a condition has
already been pointed out for the steepest descent method combined with Tikhonov
viscosity-regularization in convex minimization [3] as well as for the stabilization
of nonlinear oscillators [4, 7]. When the convex function $G$ has a unique minimum
$\bar{x}$, the condition $\int_0^\infty a(t)\,dt = \infty$ is sufficient to ensure the convergence of the
trajectories of (S) toward $\bar{x}$. If the function $G$ has a set of non-isolated equilibria, the more
stringent condition $\int_0^\infty e^{-\int_0^t a(s)\,ds}\,dt < \infty$ is necessary to obtain the convergence of
the trajectories of (S) toward some minimum point of $G$. Notice that the previous
condition fails if $a(t) = \frac{1}{t+1}$ for every $t \ge 0$, which shows that the averaged
gradient system defined above is divergent when the convex function $G$ has multiple
minima.

We also have substantial results in the non-convex setting, when the function $G$
has finitely many critical points. Under the slow decay condition $\int_0^\infty a(t)\,dt = \infty$, we
then prove that the energy function $\mathcal{E}(t)$ converges toward a critical value as
$t \to \infty$. If moreover there exists $c > 0$ such that $a(t) \ge \frac{c}{t+1}$ for every $t \ge 0$,
we show that a Cesaro average of the solution $x$ converges toward some critical
point of $G$. Finally, in the one-dimensional setting, a precise description of the
convergence of solutions is given for a general non-convex function $G$. We show
that in this case the set of initial conditions for which solutions converge to a local
minimum is open and dense.

Outline of the paper. Our work starts with a global existence result of solutions
to (S), based on the use of the Lyapounov function $\mathcal{E}$. Section 3 is concerned with
the asymptotic behavior of the energy function $\mathcal{E}$ under convex-like hypotheses
on $G$, and provides estimates on the speed of convergence of the quantity $\mathcal{E}(t)$
toward $\inf G$ as $t \to \infty$. Section 4 explores the convergence of the trajectories of
(S) in the general setting of convex functions having multiple minima. In section
5 we study the asymptotic behavior of (S) in the non-convex case when $G$ has
finitely many critical points. Finally section 6 is dedicated to the very special
one-dimensional case. Details for the stochastic gradient descent algorithm are given
in appendix A, and some special equations are discussed in appendix B.

2. General Facts 

In the entire paper, we will denote by $G$ a $C^1$ potential map from a Hilbert
space $H$ into $\mathbb{R}$ for which the gradient $g = \nabla G$ is Lipschitz continuous, uniformly
on bounded sets. Given a function $a : \mathbb{R}_+ \to \mathbb{R}_+$, we will consider the following
dynamical system

(S)    $\ddot{x}(t) + a(t)\dot{x}(t) + g(x(t)) = 0, \quad t \ge 0.$

Let us start with a basic result on existence and uniqueness for solutions of (S). In
the next statement the map $a$ may have a singularity at $t = 0$ so as to cover cases
like $a(t) = 1/t$, for $t > 0$.

Proposition 2.1. (a) Suppose $a : (0, \infty) \to \mathbb{R}_+$ is continuous on $(0, \infty)$ and
integrable on $(0, 1)$. Then for any $(x_0, x_1) \in H \times H$, there exists a unique solution $x(\cdot) \in
C^2([0,T), H)$ of (S) satisfying $x(0) = x_0$, $\dot{x}(0) = x_1$ on some maximal time interval
$[0,T) \subset [0,\infty)$.

(b) Suppose $a : (0, \infty) \to \mathbb{R}_+$ is continuous and there exists $c > 0$ such that $a(t) \le \frac{c}{t}$ for
$t \in (0,1]$. Then for any $x_0 \in H$, there exists a unique solution $x(\cdot) \in C^2((0,T), H) \cap
C^1([0,T), H)$ of (S) satisfying $x(0) = x_0$, $\dot{x}(0) = 0$ on some maximal time interval
$[0,T) \subset [0,\infty)$.

The previous proposition can be proved with standard arguments for ordinary
differential equations. The result below states the decay property of the energy
function $\mathcal{E}$ defined by

(6)    $\mathcal{E}(t) = \frac{1}{2}|\dot{x}(t)|^2 + G(x(t)).$

A global existence result is then derived when the potential function $G$ is bounded
from below. The existence of the Lyapounov function $\mathcal{E}$ will be a crucial tool for
the analysis of the asymptotic behavior of (S).

Proposition 2.2. Let $a : \mathbb{R}_+ \to \mathbb{R}_+$ be a continuous map and let $G : H \to \mathbb{R}$ be a function
of class $C^1$ such that $\nabla G$ is Lipschitz continuous on the bounded sets of $H$. Let $x$ be a
solution to (S) defined on some interval $[0, T)$, with $T \le \infty$.

(a) For every $t \in [0, T)$, the following equality holds

(7)    $\frac{d}{dt}\mathcal{E}(t) = -a(t)\,|\dot{x}(t)|^2$

and therefore for $0 \le s < t < T$

(8)    $\mathcal{E}(s) - \int_s^t a(\tau)\,|\dot{x}(\tau)|^2\,d\tau = \mathcal{E}(t).$

(b) If in addition $G$ is bounded from below on $H$, then

(9)    $\int_0^T a(t)\,|\dot{x}(t)|^2\,dt < \infty$

and the solution exists for all $T > 0$.

(c) If also $G$ is coercive^1, then all solutions to (S) remain bounded together with their first
and second derivatives for all $t > 0$. The bound depends only on the initial data.

Proof. (a) Equation (7) follows by taking the scalar product of (S) against $\dot{x}(t)$, and
(8) follows by integrating.

(b) If $G$ is bounded from below, then equality (7) shows that $t \mapsto \mathcal{E}(t)$ is decreasing
and remains bounded. Estimate (9) is then a consequence of equality (8), and it
also follows that $\sup_{t<T} \mathcal{E}(t) < \infty$. Therefore $\dot{x}$ is uniformly bounded on $[0, T)$. If
$T < \infty$, then the solution $x$ together with its derivative has a limit at $t = T$ and
therefore can be continued. Thus the solution $x(t)$ exists for all $t$ and $\dot{x}$ is uniformly
bounded by quantities depending on the initial data.

(c) Using the coercivity of $G$ and the inequality $G(x(t)) \le \mathcal{E}(t) \le \mathcal{E}(0)$, we derive
that the map $x$ is uniformly bounded. Then also $\ddot{x}(t)$ is uniformly bounded due to
the differential equation (S) and this bound depends only on the initial data. □
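The energy identity (8) lends itself to a numerical sanity check. The sketch below integrates (S) in dimension one with a classical Runge-Kutta scheme and accumulates the dissipation integral alongside the state. The choices $a(t) = 1/(t+1)$, $G(x) = x^4/4 - x^2/2$, the initial data and the step size are illustrative assumptions, as are all function names.

```python
def rk4_step(f, t, y, h):
    """One classical Runge-Kutta step for y' = f(t, y), with y a list of floats."""
    k1 = f(t, y)
    k2 = f(t + h / 2, [yi + h / 2 * ki for yi, ki in zip(y, k1)])
    k3 = f(t + h / 2, [yi + h / 2 * ki for yi, ki in zip(y, k2)])
    k4 = f(t + h, [yi + h * ki for yi, ki in zip(y, k3)])
    return [yi + h / 6 * (p + 2 * q + 2 * r + s)
            for yi, p, q, r, s in zip(y, k1, k2, k3, k4)]

def a(t):
    return 1.0 / (t + 1.0)              # assumed damping coefficient

def G(x):
    return x ** 4 / 4.0 - x ** 2 / 2.0  # assumed coercive, non-convex potential

def grad_G(x):
    return x ** 3 - x

def energy(x, v):
    return 0.5 * v * v + G(x)

def rhs(t, y):
    # State y = [x, v, D]: position, velocity, and D(t) = int_0^t a |v|^2.
    x, v, _ = y
    return [v, -a(t) * v - grad_G(x), a(t) * v * v]

def energy_balance(x0=2.0, v0=0.0, t_end=10.0, h=1e-3):
    """Return (E(0), E(t_end), dissipated energy) along one trajectory."""
    y, t = [x0, v0, 0.0], 0.0
    for _ in range(int(round(t_end / h))):
        y = rk4_step(rhs, t, y, h)
        t += h
    x, v, dissipated = y
    return energy(x0, v0), energy(x, v), dissipated

if __name__ == "__main__":
    e0, e_end, d = energy_balance()
    print(e0 - d, e_end)  # the two numbers should agree up to discretization error
```

Tracking the dissipation as an extra state variable avoids a separate quadrature and keeps the check at the same order of accuracy as the integrator.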

If $a$ does not decrease to $0$ too rapidly, then the derivative of any solution of (S)
must be arbitrarily small on arbitrarily long time intervals, infinitely often.

Proposition 2.3. Let $a : \mathbb{R}_+ \to \mathbb{R}_+$ be a non increasing map such that $\int_0^\infty a(s)\,ds = \infty$.
Let $G : H \to \mathbb{R}$ be a coercive function of class $C^1$ such that $\nabla G$ is Lipschitz continuous
on the bounded sets of $H$. Then, any solution $x$ to the differential equation (S) satisfies,
for every $T > 0$,

    $\liminf_{t \to \infty}\ \sup_{s \in [t, t+T]} |\dot{x}(s)| = 0.$

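Proof. Suppose not. Then there exist $\varepsilon > 0$ and $T > 0$ such that for all sufficiently large $k$

    $\sup_{s \in [kT, (k+1)T]} |\dot{x}(s)| \ge \varepsilon$

and thus there are $t_k \in [kT, (k+1)T]$ such that $|\dot{x}(t_k)| \ge \varepsilon$. Since the map $\dot{x}$ is
Lipschitz continuous, there exists some fixed $\delta > 0$ such that $|\dot{x}(t)| \ge \varepsilon/2$ on
$[t_k - \delta, t_k + \delta]$ for every such $k$; we may assume $\delta \le T$, so that each point belongs
to at most two of the intervals $[t_k - \delta, t_k]$. Since the map $a$ is non increasing, we have

    $\sum_k a((k+1)T) \le \sum_k a(t_k) \le \sum_k \frac{4}{\delta\varepsilon^2}\int_{t_k-\delta}^{t_k} a(t)\,|\dot{x}(t)|^2\,dt \le \frac{8}{\delta\varepsilon^2}\int_0^\infty a(t)\,|\dot{x}(t)|^2\,dt.$

Recalling that $\int_0^\infty a(t)\,|\dot{x}(t)|^2\,dt \le \mathcal{E}(0) - \inf G < \infty$, we infer that
$\sum_k a((k+1)T) < \infty$ and hence, $a$ being non increasing, that $\int_0^\infty a(t)\,dt < \infty$,
a contradiction. □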

We next show that if $\dot{x}$ is small on some interval, then $g(x(t))$ is proportionally
small on a slightly shorter interval. This implies that if solutions slow down for a
long time interval, they must be near a critical point of $G$.

Proposition 2.4. Let $a : \mathbb{R}_+ \to \mathbb{R}_+$ be a non increasing map. Let $G : H \to \mathbb{R}$ be a
function of class $C^1$ such that $g = \nabla G$ is Lipschitz continuous on the bounded sets of $H$.
If $x$ is a solution of (S) and $|\dot{x}(t)| \le \varepsilon$ on $[T_0, T_1]$, then for every $\delta \in \left(0, \frac{T_1 - T_0}{2}\right)$
we have

    $\forall t \in [T_0 + \delta, T_1 - \delta], \quad |g(x(t))| \le \left(\frac{2}{\delta} + a(T_0) + \frac{L\delta}{2}\right)\varepsilon,$

where $L > 0$ is a Lipschitz constant of the map $g$ on the set $x([T_0, T_1])$.



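^1 Let us recall that the coercivity of $G$ means that $G(\xi) \to \infty$ as $|\xi| \to \infty$.

Proof. Suppose $|\dot{x}(t)| \le \varepsilon$ on $[T_0, T_1]$ with $T_1 - T_0 > 2\delta$. Let the map $t \mapsto |g(x(t))|$
attain its maximum on $[T_0 + \delta, T_1 - \delta]$ at $t = t_0$. Since the map $g$ is $L$-Lipschitz
continuous on the set $x([T_0, T_1])$, we have for every $t \in [t_0 - \delta, t_0 + \delta]$

    $|g(x(t_0)) - g(x(t))| \le L\,|x(t_0) - x(t)| \le L\varepsilon\,|t_0 - t|$

and thus

    $|\ddot{x}(t) + g(x(t_0))| = |a(t)\dot{x}(t) + g(x(t)) - g(x(t_0))| \le a(T_0)\,\varepsilon + L\varepsilon\,|t_0 - t|$

for the same range of $t$. We then have

    $\varepsilon \ge |\dot{x}(t)| \ge \left|\int_{t_0}^{t} \ddot{x}(s)\,ds\right| - |\dot{x}(t_0)|$
    $\quad \ge \left|\int_{t_0}^{t} g(x(t_0))\,ds\right| - \left|\int_{t_0}^{t} |\ddot{x}(s) + g(x(t_0))|\,ds\right| - |\dot{x}(t_0)|$
    $\quad \ge |t_0 - t|\,|g(x(t_0))| - \varepsilon - a(T_0)\,\varepsilon\,|t_0 - t| - \frac{L}{2}\,\varepsilon\,|t_0 - t|^2$

and therefore

    $|g(x(t_0))| \le \left(\frac{2}{|t_0 - t|} + a(T_0) + \frac{L}{2}\,|t_0 - t|\right)\varepsilon.$

Set $t = t_0 \pm \delta$ to conclude $|g(x(t_0))| \le \left(\frac{2}{\delta} + a(T_0) + \frac{L\delta}{2}\right)\varepsilon$. □

By combining Propositions 2.3 and 2.4 we derive the following corollary.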



Corollary 2.1. Under the assumptions of Proposition 2.3, any solution $x$ to the differential
equation (S) satisfies, for every $T > 0$,

    $\liminf_{t \to \infty}\ \sup_{s \in [t, t+T]} |g(x(s))| = 0.$

The proof is immediate and left to the reader.

The last result of this section establishes that, if a solution $x$ to (S) converges
toward some $\bar{x} \in H$, then $\bar{x}$ is a critical point of $G$ and moreover the velocity
$\dot{x}(t)$ and the acceleration $\ddot{x}(t)$ tend to $0$ as $t \to \infty$.

Proposition 2.5. Let $a : \mathbb{R}_+ \to \mathbb{R}_+$ be a bounded continuous map and let $G : H \to \mathbb{R}$
be a function of class $C^1$ such that $\nabla G$ is Lipschitz continuous on the bounded sets of $H$.
Consider a solution $x$ to (S) and assume that there exists $\bar{x} \in H$ such that $\lim_{t\to\infty} x(t) =
\bar{x}$. Then we have $\lim_{t\to\infty} \dot{x}(t) = \lim_{t\to\infty} \ddot{x}(t) = 0$ and the vector $\bar{x}$ satisfies $\nabla G(\bar{x}) = 0$.

Proof. Since $x(\cdot)$ converges, it is uniformly bounded. Due to the inequality $\mathcal{E}(t) \le
\mathcal{E}(0)$ for every $t \ge 0$, $\dot{x}(\cdot)$ is also uniformly bounded, and just as in the proof of
Proposition 2.2 (c), the map $\ddot{x}(\cdot)$ is uniformly bounded as well by some constant
$M > 0$. Landau's inequality applied to the map $t \mapsto x(t) - \bar{x}$ yields, for every
$t \ge 0$,

    $\sup_{[t,\infty[} |\dot{x}| \le 2\sqrt{\sup_{[t,\infty[} |x - \bar{x}| \cdot \sup_{[t,\infty[} |\ddot{x}|} \le 2\sqrt{M}\,\sqrt{\sup_{[t,\infty[} |x - \bar{x}|}.$

By using the assumption $\lim_{t\to\infty} x(t) = \bar{x}$ and letting $t \to \infty$ in the above
inequality, we derive that $\lim_{t\to\infty} \dot{x}(t) = 0$. Since $\lim_{t\to\infty} \nabla G(x(t)) = \nabla G(\bar{x})$, the
differential equation (S) shows that $\lim_{t\to\infty} \ddot{x}(t) = -\nabla G(\bar{x})$. If $\nabla G(\bar{x}) \ne 0$, an
immediate integration gives the equivalence $\dot{x}(t) \sim -t\,\nabla G(\bar{x})$ as $t \to \infty$, a
contradiction. Thus we conclude that $\nabla G(\bar{x}) = 0$ and $\lim_{t\to\infty} \ddot{x}(t) = 0$. □
3. Case of a convex-like potential. Energy estimates

As in the previous section, (S) is studied on a general Hilbert space $H$ in this
section. Throughout, we will assume that the function $G$ satisfies the following
condition: there exist $z \in \operatorname{argmin} G$ and $\theta \in \mathbb{R}_+$ such that

(10)    $\forall x \in H, \quad G(x) - G(z) \le \theta\,\langle \nabla G(x), x - z \rangle.$

This can be viewed as a generalization of the notion of convexity, and it also
generalizes Euler's identity for homogeneous functions. Indeed, if $G$ is convex,
then condition (10) is satisfied with $\theta = 1$ for every $z \in H$. Now assume that $G$ is
defined by $G(x) = \frac{1}{2}\langle Ax, x \rangle$ where $A \in \mathcal{L}(H)$ is symmetric and positive. We then
have $g(x) = Ax$ and inequality (10) holds as an equality with $\theta = \frac{1}{2}$ and $z = 0$.
Finally, if $G$ is defined by $G(x) = \frac{|x|^p}{p}$ with $p > 1$, we have $g(x) = x\,|x|^{p-2}$ and
inequality (10) is satisfied with $\theta = \frac{1}{p}$ and $z = 0$ as an equality.
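As a quick check of the last example, the following computation verifies condition (10) for $G(x) = |x|^p/p$ with $z = 0$. It uses only the expression of $g$ given above.

```latex
\begin{align*}
\langle \nabla G(x),\, x - 0 \rangle
  &= \langle x\,|x|^{p-2},\, x \rangle = |x|^{p} = p\,G(x), \\
G(x) - G(0)
  &= \frac{1}{p}\,|x|^{p} = \frac{1}{p}\,\langle \nabla G(x),\, x - 0 \rangle .
\end{align*}
```

Hence (10) holds as an equality with $\theta = 1/p$. Taking $p = 2$ gives $\theta = 1/2$, in agreement with the quadratic case $G(x) = \frac{1}{2}\langle Ax, x \rangle$ when $A$ is the identity.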

3.1. A result of summability. First we give a result of summability of the function
$t \mapsto \mathcal{E}(t) - \min G$ over $\mathbb{R}_+$, with respect to some measure depending on the map
$a$. This property will imply some convergence results on $\mathcal{E}$ under some weak
hypotheses on the function $a$.

Proposition 3.1. Let $a : \mathbb{R}_+ \to \mathbb{R}_+$ be a non increasing and differentiable map. Let
$G : H \to \mathbb{R}$ be a coercive function of class $C^1$ such that $\nabla G$ is Lipschitz continuous on the
bounded sets of $H$. Assume that $\operatorname{argmin} G \ne \emptyset$ and that there exist $z \in \operatorname{argmin} G$ and
$\theta \in \mathbb{R}_+$ such that condition (10) holds. Then, any solution $x$ to the differential equation
(S) satisfies the following estimate

    $\int_0^\infty a(t)\,(\mathcal{E}(t) - \min G)\,dt < \infty.$



Proof. Let us define the function $h : \mathbb{R}_+ \to \mathbb{R}$ by

    $h(t) = \frac{a(t)}{2}\,|x(t) - z|^2 + \langle \dot{x}(t), x(t) - z \rangle.$

By differentiating, we find:

    $\dot{h}(t) = \frac{\dot{a}(t)}{2}\,|x(t) - z|^2 + a(t)\,\langle \dot{x}(t), x(t) - z \rangle + \langle \ddot{x}(t), x(t) - z \rangle + |\dot{x}(t)|^2.$

Since $\dot{a}(t) \le 0$, we derive that

    $\dot{h}(t) \le |\dot{x}(t)|^2 + \langle \ddot{x}(t) + a(t)\dot{x}(t), x(t) - z \rangle$
    $\quad = |\dot{x}(t)|^2 - \langle g(x(t)), x(t) - z \rangle.$

Let us now fix $m \in \left(0, \frac{1}{\theta + 1/2}\right]$. Recalling that $\dot{\mathcal{E}}(t) = -a(t)\,|\dot{x}(t)|^2$, we find

    $\dot{\mathcal{E}}(t) + m\,a(t)\,(\mathcal{E}(t) - \min G) + \theta m\,a(t)\,\dot{h}(t) \le$
    $\quad (-1 + (\theta + 1/2)\,m)\,a(t)\,|\dot{x}(t)|^2 + m\,a(t)\,\left[G(x(t)) - \min G - \theta\,\langle g(x(t)), x(t) - z \rangle\right].$

Using condition (10) and the fact that $m \le \frac{1}{\theta + 1/2}$ we deduce

(11)    $\dot{\mathcal{E}}(t) + m\,a(t)\,(\mathcal{E}(t) - \min G) + \theta m\,a(t)\,\dot{h}(t) \le 0.$

Let us integrate the previous inequality on $[0, t]$. Since $\mathcal{E}(t) \ge \min G$, we obtain

(12)    $m \int_0^t a(s)\,(\mathcal{E}(s) - \min G)\,ds \le \mathcal{E}(0) - \min G - \theta m \int_0^t a(s)\,\dot{h}(s)\,ds.$

Then, remark that

(13)    $\int_0^t a(s)\,\dot{h}(s)\,ds = a(t)h(t) - a(0)h(0) - \int_0^t \dot{a}(s)\,h(s)\,ds.$

From the decay of the energy function $\mathcal{E}$, it ensues that $t \mapsto |\dot{x}(t)|$ and $t \mapsto G(x(t))$
are bounded. Since the map $G$ is coercive, we infer that the map $t \mapsto |x(t)|$ is
bounded. From the expression of $h$ and the boundedness of $t \mapsto \mathcal{E}(t)$, and thus of
$t \mapsto |\dot{x}(t)|$, we immediately conclude the existence of $M > 0$ such that $|h(t)| \le M$
for every $t \ge 0$. We then derive from (13) that

    $\left|\int_0^t a(s)\,\dot{h}(s)\,ds\right| \le M a(t) + M a(0) + M \int_0^t |\dot{a}(s)|\,ds$
    $\quad = M a(t) + M a(0) + M\,(a(0) - a(t)) = 2M a(0).$

From (12), we now have that

    $\forall t \ge 0, \quad m \int_0^t a(s)\,(\mathcal{E}(s) - \min G)\,ds \le \mathcal{E}(0) - \min G + 2\theta m M a(0)$

and we conclude that $\int_0^\infty a(s)\,(\mathcal{E}(s) - \min G)\,ds < \infty$. □

Now, we can prove the convergence of $\mathcal{E}(t)$ toward $\min G$ as $t \to \infty$, provided
that $\int_0^\infty a(t)\,dt = \infty$. Notice that this assumption amounts to saying that the
quantity $a(t)$ slowly tends to $0$ as $t \to \infty$.

Corollary 3.1. Under the hypotheses of Proposition 3.1, assume moreover that $\int_0^\infty a(t)\,dt =
\infty$. Then $\lim_{t\to\infty} \mathcal{E}(t) = \min G$. As a consequence, $\lim_{t\to\infty} |\dot{x}(t)| = 0$ and
$\lim_{t\to\infty} G(x(t)) = \min G$.

Proof. Let us argue by contradiction and assume that $\lim_{t\to\infty} \mathcal{E}(t) > \min G$. This
implies the existence of $\eta > 0$ such that $\mathcal{E}(t) - \min G \ge \eta$ for every $t \ge 0$. We
deduce that

    $\int_0^\infty a(t)\,(\mathcal{E}(t) - \min G)\,dt \ge \eta \int_0^\infty a(t)\,dt = \infty.$

This contradicts Proposition 3.1, and we obtain the conclusions that $\lim_{t\to\infty} |\dot{x}(t)| = 0$
and $\lim_{t\to\infty} G(x(t)) = \min G$. □

The next corollary makes precise the speed of convergence of $\mathcal{E}$ toward $\min G$ under
some assumption on the decay of $t \mapsto a(t)$.

Corollary 3.2. Under the hypotheses of Proposition 3.1, assume moreover that there exists
$m > 0$ such that $a(t) \ge m/t$ for $t$ large enough. Then

    $\mathcal{E}(t) - \min G = o\left(\frac{1}{t\,a(t)}\right) \quad \text{as } t \to \infty.$

Proof. Since the functions $a$ and $\mathcal{E} - \min G$ are non increasing and non-negative, it is
immediate that the map $t \mapsto a(t)\,(\mathcal{E}(t) - \min G)$ is also non increasing. In particular, we
obtain

    $\int_{t/2}^{t} a(s)\,(\mathcal{E}(s) - \min G)\,ds \ge \frac{t}{2}\,a(t)\,(\mathcal{E}(t) - \min G).$

Since $\int_0^\infty a(s)\,(\mathcal{E}(s) - \min G)\,ds < \infty$, the left member of the above inequality
tends to $0$ as $t \to \infty$, which implies that $\lim_{t\to\infty} t\,a(t)\,(\mathcal{E}(t) - \min G) = 0$. □
3.2. Case of a unique minimum. In view of the previous results, we are able to
investigate the question of the convergence of the trajectories in the case of a unique
minimum. Studies with several minima are more complicated and will be
detailed in section 4 (convex setting), section 5 (non-convex setting) and section 6
(one-dimensional case).

Proposition 3.2. Let $a : \mathbb{R}_+ \to \mathbb{R}_+$ be a non increasing and differentiable map such that
$\int_0^\infty a(t)\,dt = \infty$. Consider a map $\alpha : \mathbb{R}_+ \to \mathbb{R}_+$ such that $\alpha(t_n) \to 0 \implies t_n \to 0$ for
every sequence $(t_n) \subset \mathbb{R}_+$. Let $G : H \to \mathbb{R}$ be a coercive function of class $C^1$ such that
$\nabla G$ is Lipschitz continuous on the bounded sets of $H$. Given $\bar{x} \in H$, assume that

(14)    $\forall x \in H, \quad G(x) \ge G(\bar{x}) + \alpha(|x - \bar{x}|),$

and that there exists $\theta \in \mathbb{R}_+$ such that condition (10) holds with $z = \bar{x}$. Then, any
solution $x$ to the differential equation (S) satisfies $\lim_{t\to\infty} x(t) = \bar{x}$ strongly in $H$.

Proof. By applying Corollary 3.1 we obtain $\lim_{t\to\infty} G(x(t)) = \min G = G(\bar{x})$.
From assumption (14), we deduce that $\lim_{t\to\infty} \alpha(|x(t) - \bar{x}|) = 0$ and we finally
conclude that $\lim_{t\to\infty} |x(t) - \bar{x}| = 0$. □

If the stringent condition (14) is not satisfied, one can nevertheless obtain a
result of weak convergence, as shown by the following statement.

Proposition 3.3. Let $a : \mathbb{R}_+ \to \mathbb{R}_+$ be a non increasing and differentiable map such that
$\int_0^\infty a(t)\,dt = \infty$. Let $G : H \to \mathbb{R}$ be a convex coercive function of class $C^1$ such that $\nabla G$
is Lipschitz continuous on the bounded sets of $H$. If $\operatorname{argmin} G = \{\bar{x}\}$ for some $\bar{x} \in H$,
then any solution $x$ to the differential equation (S) weakly converges to $\bar{x}$ in $H$.

Proof. Since $G$ is coercive, the trajectory $x$ is bounded. Hence there exist $x_\infty \in H$
and a sequence $(t_n)$ tending to $\infty$ such that $\lim_{n\to\infty} x(t_n) = x_\infty$ weakly in $H$.
Since $G$ is convex and continuous for the strong topology, it is lower semicontinuous
for the weak topology. Hence, we have:

    $G(x_\infty) \le \liminf_{n\to\infty} G(x(t_n)).$

On the other hand, by applying Corollary 3.1 we obtain $\lim_{t\to\infty} G(x(t)) = \min G$.
Therefore we deduce that $G(x_\infty) \le \min G$, i.e. $x_\infty \in \operatorname{argmin} G = \{\bar{x}\}$. Hence $\bar{x}$ is
the unique limit point of the map $t \mapsto x(t)$ as $t \to \infty$ for the weak topology. It
ensues that $\lim_{t\to\infty} x(t) = \bar{x}$ weakly in $H$. □

3.3. Convergence rate of the energy function $\mathcal{E}$. In this paragraph, we will give
lower and upper bounds for the difference $\mathcal{E}(t) - \inf G$ as $t \to \infty$. We start with
the particular case corresponding to $G(x) = |x|^2/2$. In this case, the differential
equation (S) becomes

(15)    $\ddot{x}(t) + a(t)\dot{x}(t) + x(t) = 0.$

The next proposition makes precise the rate at which solutions converge to $0$.

Proposition 3.4. Let $a : \mathbb{R}_+ \to \mathbb{R}_+$ be a non increasing map of class $C^2$. Assume that
$\lim_{t\to+\infty} \dot{a}(t) = \lim_{t\to\infty} a(t) = 0$ and that the map $t \mapsto \ddot{a}(t) + a(t)\,\dot{a}(t)$ has a constant
sign when $t \to \infty$. Let $x$ be a solution of the differential equation (15). Then there exist
constants $0 < k < K < \infty$ such that for $t$ large enough

(16)    $k\,e^{-\int_0^t a(s)\,ds} \le |x(t)|^2 + |\dot{x}(t)|^2 \le K\,e^{-\int_0^t a(s)\,ds}.$
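Proof. We eliminate the first order term in (15) in the usual way: if we set $A(t) =
e^{\frac{1}{2}\int_0^t a(s)\,ds}$, then the map $y$ defined by $y(t) = A(t)\,x(t)$ satisfies

(17)    $\ddot{y}(t) + \left(1 - \frac{a(t)^2}{4} - \frac{\dot{a}(t)}{2}\right) y(t) = 0$

for every $t \ge 0$. Define the function $E : \mathbb{R}_+ \to \mathbb{R}$ by

    $E(t) = |\dot{y}(t)|^2 + \left(1 - \frac{a(t)^2}{4} - \frac{\dot{a}(t)}{2}\right) |y(t)|^2$

for every $t \ge 0$. Then $E(t)$ is non-negative for all sufficiently large $t$, and the
expression of $E(t)$ as a function of $x(t)$, $\dot{x}(t)$ is given by

    $E(t) = A(t)^2 \left[\left(1 - \frac{a(t)^2}{4} - \frac{\dot{a}(t)}{2}\right) |x(t)|^2 + \left|\frac{a(t)}{2}\,x(t) + \dot{x}(t)\right|^2\right].$

Therefore for sufficiently large $t$

(18)    $\frac{1}{2}\,A(t)^{-2}\,E(t) \le |x(t)|^2 + |\dot{x}(t)|^2 \le 2\,A(t)^{-2}\,E(t).$

Multiplying equation (17) by $\dot{y}(t)$ results in

    $\dot{E}(t) = -\frac{1}{2}\left[\ddot{a}(t) + a(t)\,\dot{a}(t)\right] |y(t)|^2.$

Assume now that $\ddot{a}(t) + a(t)\,\dot{a}(t) \le 0$ for $t$ large enough. Since $|y(t)|^2 \le 2E(t)$ for
sufficiently large $t$, we derive that there exists $T \ge 0$ such that

    $\forall t \ge T, \quad 0 \le \dot{E}(t) \le -\left[\ddot{a}(t) + a(t)\,\dot{a}(t)\right] E(t).$

By integrating over $[T, t]$, we obtain

    $\forall t \ge T, \quad \ln\frac{E(t)}{E(T)} \le -\left[\dot{a}(s) + \frac{a^2(s)}{2}\right]_{s=T}^{s=t} \le \dot{a}(T) + \frac{a^2(T)}{2},$

where the last inequality holds because $\dot{a} + a^2/2$ is non increasing on $[T, \infty)$ and
tends to $0$, hence is non-negative there. By setting $C = \exp\left(\dot{a}(T) + \frac{a^2(T)}{2}\right)$, we then have

(19)    $\forall t \ge T, \quad E(T) \le E(t) \le C\,E(T).$

Then estimate (16) follows from (18) and (19). If we assume that $\ddot{a}(t) + a(t)\,\dot{a}(t) \ge 0$
for $t$ large enough, the same arguments show that there exist $T' \ge 0$ and $C' \in (0, 1)$
such that

    $\forall t \ge T', \quad C'\,E(T') \le E(t) \le E(T'),$

and we conclude in the same way. □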

Example 3.1. Assume that $a(t) = \frac{c}{t+1}$ for every $t \ge 0$, with $c > 0$. It is immediate to
check that for $t$ large enough, $\ddot{a}(t) + a(t)\,\dot{a}(t) > 0$ (resp. $< 0$) if $c < 2$ (resp. $c > 2$).
Therefore the assumptions of Proposition 3.4 are satisfied and the following estimate holds
for $t$ large enough

    $\frac{k}{t^c} \le |x(t)|^2 + |\dot{x}(t)|^2 \le \frac{K}{t^c}.$
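The two-sided estimate for $a(t) = c/(t+1)$ can be observed numerically. The sketch below integrates equation (15) with a classical Runge-Kutta scheme and records the extreme values of $(t+1)^c\,(|x(t)|^2 + |\dot{x}(t)|^2)$ over a late time window; if the estimate holds, these values stay between two positive constants. The initial data, window and step size are illustrative assumptions, and the function names are hypothetical.

```python
def rk4_step(f, t, y, h):
    """One classical Runge-Kutta step for y' = f(t, y), with y a list of floats."""
    k1 = f(t, y)
    k2 = f(t + h / 2, [yi + h / 2 * ki for yi, ki in zip(y, k1)])
    k3 = f(t + h / 2, [yi + h / 2 * ki for yi, ki in zip(y, k2)])
    k4 = f(t + h, [yi + h * ki for yi, ki in zip(y, k3)])
    return [yi + h / 6 * (p + 2 * q + 2 * r + s)
            for yi, p, q, r, s in zip(y, k1, k2, k3, k4)]

def decay_ratio(c=1.0, x0=1.0, v0=0.0, t_start=50.0, t_end=200.0, h=0.01):
    """Min and max of (t+1)^c (x^2 + v^2) on [t_start, t_end] for x'' + c/(t+1) x' + x = 0."""
    def rhs(t, y):
        x, v = y
        return [v, -c / (t + 1.0) * v - x]

    y, t = [x0, v0], 0.0
    lo, hi = float("inf"), 0.0
    for _ in range(int(round(t_end / h))):
        y = rk4_step(rhs, t, y, h)
        t += h
        if t >= t_start:
            scaled = (t + 1.0) ** c * (y[0] ** 2 + y[1] ** 2)
            lo, hi = min(lo, scaled), max(hi, scaled)
    return lo, hi

if __name__ == "__main__":
    print(decay_ratio(c=1.0))
```

Rescaling by $(t+1)^c$ rather than $t^c$ matches $e^{\int_0^t a(s)\,ds}$ exactly; the two normalizations differ only by a factor tending to $1$.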
Example 3.2. Assume that $a(t) = \frac{1}{(t+1)^\alpha}$ for every $t \ge 0$, with $\alpha \in (0,1)$. We let the
reader check that $\ddot{a}(t) + a(t)\,\dot{a}(t) < 0$ for $t$ large enough. Therefore the assumptions of
Proposition 3.4 are satisfied and the following estimate holds for $t$ large enough

    $k\,e^{-\frac{t^{1-\alpha}}{1-\alpha}} \le |x(t)|^2 + |\dot{x}(t)|^2 \le K\,e^{-\frac{t^{1-\alpha}}{1-\alpha}}.$

The result of Proposition 3.4 and Examples 3.1 and 3.2 will serve us as a guideline
in the sequel. Let us now come back to the case of a general potential $G$. The next
result provides a lower bound for the convergence rate of the energy function $\mathcal{E}$.
We stress the fact that there is no convexity assumption on the function $G$ in the
next statement.

Proposition 3.5. Let $a : \mathbb{R}_+ \to \mathbb{R}_+$ be a continuous map and let $G : H \to \mathbb{R}$ be a
function of class $C^1$ such that $\nabla G$ is Lipschitz continuous on the bounded sets of $H$. If
$\inf G > -\infty$, then any solution $x$ of (S) satisfies

(20)    $\forall t \ge 0, \quad \mathcal{E}(t) - \inf G \ge (\mathcal{E}(0) - \inf G)\,e^{-2\int_0^t a(s)\,ds}.$

Proof. Taking into account the expression of $\mathcal{E}$ and the computation of $\dot{\mathcal{E}}$, we have

    $\dot{\mathcal{E}}(t) + 2a(t)\,(\mathcal{E}(t) - \inf G) = 2a(t)\,(G(x(t)) - \inf G) \ge 0.$

Let us multiply the above inequality by $e^{2\int_0^t a(s)\,ds}$; we deduce that:

    $\forall t \ge 0, \quad \frac{d}{dt}\left[e^{2\int_0^t a(s)\,ds}\,(\mathcal{E}(t) - \inf G)\right] \ge 0.$

Formula (20) immediately follows. □
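For the reader's convenience, the first identity in the proof above can be expanded as follows, using only the definition (6) of $\mathcal{E}$ and equality (7).

```latex
\begin{align*}
\dot{\mathcal{E}}(t) + 2a(t)\,\bigl(\mathcal{E}(t) - \inf G\bigr)
  &= -a(t)\,|\dot{x}(t)|^{2}
     + 2a(t)\,\Bigl(\tfrac{1}{2}\,|\dot{x}(t)|^{2} + G(x(t)) - \inf G\Bigr) \\
  &= 2a(t)\,\bigl(G(x(t)) - \inf G\bigr) \;\ge\; 0 .
\end{align*}
```

The kinetic terms cancel exactly, which is why no convexity assumption on $G$ is needed for this lower bound.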

The next corollary gives a first result of non-convergence of the trajectories
under the condition $\int_0^\infty a(s)\,ds < \infty$. This hypothesis means that the quantity $a(t)$
rapidly tends to $0$ as $t \to \infty$. It is not surprising that convergence fails under such a
condition, cf. for example the extreme case $a \equiv 0$.

Corollary 3.3. Assume that $\int_0^\infty a(s)\,ds < \infty$, that the function $G$ is convex, and all the
other hypotheses of Proposition 3.5. Given $(x_0, \dot{x}_0) \in H^2$, consider the unique solution $x$
to the differential equation (S) satisfying the initial conditions $(x(0), \dot{x}(0)) = (x_0, \dot{x}_0)$.
If $(x_0, \dot{x}_0) \notin \operatorname{argmin} G \times \{0\}$, then the trajectory $x$ of (S) does not converge.

Proof. Let us first remark that the assumption $(x_0, \dot{x}_0) \notin \operatorname{argmin} G \times \{0\}$ implies
that $\mathcal{E}(0) > \inf G$. By taking the limit as $t \to \infty$ in inequality (20) and recalling that
$\int_0^\infty a(s)\,ds < \infty$, we obtain

(21)    $\lim_{t\to\infty} \mathcal{E}(t) - \inf G \ge (\mathcal{E}(0) - \inf G)\,e^{-2\int_0^\infty a(s)\,ds} > 0.$

Let us now argue by contradiction and assume that there exists $\bar{x} \in H$ such that
$\lim_{t\to+\infty} x(t) = \bar{x}$. From Proposition 2.5 we deduce that $\lim_{t\to\infty} \dot{x}(t) = 0$ and that
$\nabla G(\bar{x}) = 0$. Since the function $G$ is convex, we infer that $\bar{x} \in \operatorname{argmin} G$. It ensues
that $\lim_{t\to\infty} \mathcal{E}(t) = \min G$, which contradicts (21). □

The problem of convergence of the trajectories will be considered again in
section 4. It will be shown that condition $\int_0^\infty a(s)\,ds = \infty$ is not sufficient to ensure
convergence.

We are now going to bound the map $t \mapsto \mathcal{E}(t) - \inf G$ from above as $t \to \infty$ by some
suitable quantities depending on the function $a$.
Proposition 3.6. Let $a : \mathbb{R}_+ \to \mathbb{R}_+$ be a non increasing and differentiable map. Let
$G : H \to \mathbb{R}$ be a coercive function of class $C^1$ such that $\nabla G$ is Lipschitz continuous on
the bounded sets of $H$. Assume that $\operatorname{argmin} G \ne \emptyset$ and that there exist $z \in \operatorname{argmin} G$
and $\theta \in \mathbb{R}_+$ such that condition (10) holds.

(i) Suppose that there exist $K_1 > 0$ and $t_1 \ge 0$ such that $\dot{a}(t) + K_1\,a^2(t) \le 0$, for every
$t \ge t_1$. Then, there exists $C > 0$ such that, for every $t \ge t_1$,

(22)    $\mathcal{E}(t) - \min G \le C\,e^{-m\int_0^t a(s)\,ds},$

with $m = \min\left(\frac{1}{\theta + 1/2}, K_1\right)$.

(ii) Suppose that there exist $K_2 \in \left(0, \frac{1}{\theta + 1/2}\right)$ and $t_2 \ge 0$ such that $\dot{a}(t) + K_2\,a^2(t) \ge 0$,
for every $t \ge t_2$. Then, there exists $D > 0$ such that, for every $t \ge t_2$,

(23)    $\mathcal{E}(t) - \min G \le D\,a(t).$

Remark 3.1. It is immediate to see that the assumptions $\dot{a} + K_1 a^2 \le 0$ and $\dot{a} +
K_2 a^2 \ge 0$ imply respectively that $a(t) \le 1/(K_1 t + c_1)$ and $a(t) \ge 1/(K_2 t + c_2)$, for
some $c_1, c_2 \in \mathbb{R}$.
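The first implication of Remark 3.1 can be spelled out as follows. This is a short sketch under the additional assumption that $a(t) > 0$ for $t \ge t_1$; the explicit constant $c_1$ below is one admissible choice, not necessarily the one the authors have in mind. The second implication is obtained in the same way with the inequalities reversed.

```latex
\begin{align*}
\frac{d}{dt}\left(\frac{1}{a(t)}\right) = -\frac{\dot{a}(t)}{a(t)^{2}} \;\ge\; K_1
  \quad (t \ge t_1)
\quad\Longrightarrow\quad
\frac{1}{a(t)} \;\ge\; \frac{1}{a(t_1)} + K_1\,(t - t_1) = K_1\,t + c_1,
\qquad c_1 := \frac{1}{a(t_1)} - K_1\,t_1 .
\end{align*}
```

Whenever $K_1 t + c_1 > 0$, inverting both sides gives $a(t) \le 1/(K_1 t + c_1)$.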

Proof. We keep the same notations as in the proof of Proposition 3.1; in particular
the expression of the map $h$ is given by

    $h(t) = \frac{a(t)}{2}\,|x(t) - z|^2 + \langle \dot{x}(t), x(t) - z \rangle.$

Let us multiply inequality (11) by $e^{m\int_0^t a(s)\,ds}$ and integrate on the interval $[0, t]$:

(24)    $e^{m\int_0^t a(s)\,ds}\,(\mathcal{E}(t) - \min G) \le \mathcal{E}(0) - \min G - \theta m \int_0^t F(s)\,\dot{h}(s)\,ds,$

where the function $F : \mathbb{R}_+ \to \mathbb{R}_+$ is defined by $F(s) = a(s)\,e^{m\int_0^s a(u)\,du}$. The
function $F$ is differentiable and its first derivative is given by

(25)    $\dot{F}(s) = \left(\dot{a}(s) + m\,a^2(s)\right) e^{m\int_0^s a(u)\,du}.$

Coming back to inequality (24), an integration by parts yields

    $\int_0^t F(s)\,\dot{h}(s)\,ds = F(t)\,h(t) - F(0)\,h(0) - \int_0^t \dot{F}(s)\,h(s)\,ds.$

Recalling that $|h(t)| \le M$ for every $t \ge 0$, we infer that

(26)    $\left|\int_0^t F(s)\,\dot{h}(s)\,ds\right| \le M \left[F(t) + F(0) + \int_0^t |\dot{F}(s)|\,ds\right].$

We now distinguish between the cases (i) and (ii), where the assumptions allow us to
determine the sign of $\dot{F}$.

(i) First assume that there exist $K_1 > 0$ and $t_1 \ge 0$ such that $\dot a(t) + K_1 a^2(t) \le 0$, for
every $t \ge t_1$. Let us take $m = \min\left( \frac{1}{\theta + 1/2},\, K_1 \right)$ throughout the proof of (i). Since
$m \le K_1$, we have $\dot a(t) + m\, a^2(t) \le \dot a(t) + K_1 a^2(t) \le 0$ for every $t \ge t_1$. It ensues
from (25) that $\dot F(t) \le 0$ for every $t \ge t_1$. Hence we derive from (26) that, for every
$t \ge t_1$,

    $\left| \int_0^t F(s)\, \dot h(s)\, ds \right| \le M \left[ F(t) + F(0) + \int_0^{t_1} |\dot F(s)|\, ds - \int_{t_1}^t \dot F(s)\, ds \right]$
    $\qquad = M \left[ F(0) + \int_0^{t_1} |\dot F(s)|\, ds + F(t_1) \right] =: C_1$.

In view of (24), we deduce that, for every $t \ge t_1$,

    $e^{m \int_0^t a(s)\,ds} \left( \mathcal{E}(t) - \min G \right) \le \mathcal{E}(0) - \min G + \theta m C_1 =: C$.

Inequality (22) immediately follows.



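(ii) Now assume that there exist $K_2 \in \left]0, \frac{1}{\theta + 1/2}\right[$ and $t_2 \ge 0$ such that $\dot a(t) +
K_2 a^2(t) \ge 0$, for every $t \ge t_2$. Take any $m \in \left]K_2, \frac{1}{\theta + 1/2}\right]$. Since $m > K_2$, we have
$\dot a(t) + m\, a^2(t) \ge \dot a(t) + K_2 a^2(t) \ge 0$ for every $t \ge t_2$. It ensues from (25) that
$\dot F(t) \ge 0$ for every $t \ge t_2$. Hence we derive from (26) that, for every $t \ge t_2$,

    $\left| \int_0^t F(s)\, \dot h(s)\, ds \right| \le M \left[ F(t) + F(0) + \int_0^{t_2} |\dot F(s)|\, ds + \int_{t_2}^t \dot F(s)\, ds \right]$
    $\qquad = M \left[ 2 F(t) + F(0) - F(t_2) + \int_0^{t_2} |\dot F(s)|\, ds \right]$
    $\qquad =: 2 M F(t) + C_2$.

In view of (24), we deduce that, for every $t \ge t_2$,

    $e^{m \int_0^t a(s)\,ds} \left( \mathcal{E}(t) - \min G \right) \le \mathcal{E}(0) - \min G + \theta m C_2 + 2 \theta m M F(t)$.

We then infer the existence of $C_3 > 0$ such that

    $e^{m \int_0^t a(s)\,ds} \left( \mathcal{E}(t) - \min G \right) \le C_3 F(t)$,

which finally implies that $\mathcal{E}(t) - \min G \le C_3\, a(t)$, for every $t \ge t_2$. □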



Let us now comment on the results given by Proposition 3.6. Assume that $G$ is
identically equal to $0$. In this case, condition (10) trivially holds with $\theta = 0$. If the
map $a$ satisfies $\dot a + K_1 a^2 \le 0$ for some $K_1 \ge 2$, Proposition 3.6 (i) shows that

    $\frac{1}{2} |\dot x(t)|^2 = \mathcal{E}(t) \le C\, e^{-2 \int_0^t a(s)\,ds}$.

A direct computation shows that $\dot x(t) = \dot x_0\, e^{-\int_0^t a(s)\,ds}$, so that the estimate given by
Proposition 3.6 (i) is optimal in this case. Now assume that $G$ is given by $G(x) =
|x|^2/2$ for every $x \in H$. In this case, condition (10) holds with $\theta = 1/2$ and $z = 0$.
Suppose that $a(t) = \frac{c}{t+1}$ for $c \in (0, 1]$. The map $a$ satisfies the inequality $\dot a +
K_1 a^2 \le 0$ for every $K_1 \in (0, 1]$. Proposition 3.6 (i) then shows that $\mathcal{E}(t) \le C/(t +
1)^c$ for every $t \ge 0$. Example 3.1 allows us to check that this estimate is optimal.
Suppose now that $a(t) = \frac{1}{(t+1)^\alpha}$ for some $\alpha \in (0, 1)$. In this case, the map $a$ satisfies
the inequality $\dot a + K_2 a^2 \ge 0$ for any $K_2 > 0$. Proposition 3.6 (ii) then gives the
estimate $\mathcal{E}(t) \le D/(t+1)^\alpha$, while in fact $\mathcal{E}(t) \le K e^{-t^{1-\alpha}/(1-\alpha)}$ by the result of Example

4. Convergence of the trajectory. Convex case

Throughout this section, we are going to investigate the question of the convergence
of the trajectories associated to (S). A first result in this direction is given
by Corollary 3.3, which states that the trajectories of (S) are not convergent under
the condition $\int_0^\infty a(s)\,ds < \infty$ (except for stationary solutions). Let us now
consider the particular case $G \equiv 0$. The differential equation (S) then becomes
$\ddot x(t) + a(t)\, \dot x(t) = 0$ and a double integration immediately shows that its solution
is given by:

    $x(t) = x(0) + \dot x(0) \int_0^t e^{-\int_0^s a(u)\,du}\, ds$.

It ensues that, when $G \equiv 0$, the solution $x$ converges if and only if the quantity
$\int_0^\infty e^{-\int_0^s a(u)\,du}\, ds$ is finite. Therefore it is natural to ask whether, for a general potential
$G$, the trajectory $x$ is convergent under the condition $\int_0^\infty e^{-\int_0^s a(u)\,du}\, ds < \infty$.
The answer is quite complex and we will start our analysis in the one-dimensional
case.
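The case $G \equiv 0$ can be checked numerically. The sketch below is an illustration only: it assumes the specific damping $a(t) = c/(t+1)$ (for which the double integral has a closed form when $c \neq 1$), and the function names `damping`, `closed_form` and `rk4_free` are hypothetical helpers, not taken from the paper. It integrates $\ddot x + a(t)\dot x = 0$ with a classical Runge-Kutta scheme and compares with the formula above.

```python
def damping(t, c):
    """Illustrative damping a(t) = c / (t + 1); an assumption made for this sketch."""
    return c / (t + 1.0)


def closed_form(t, x0, v0, c):
    """x(t) = x0 + v0 * int_0^t (s + 1)**(-c) ds, valid for c != 1."""
    return x0 + v0 * ((t + 1.0) ** (1.0 - c) - 1.0) / (1.0 - c)


def rk4_free(x0, v0, t_end, dt, c):
    """Integrate x'' + a(t) x' = 0 (the case G = 0) with classical RK4."""
    def rhs(t, x, v):
        return v, -damping(t, c) * v

    t, x, v = 0.0, x0, v0
    for _ in range(int(round(t_end / dt))):
        k1x, k1v = rhs(t, x, v)
        k2x, k2v = rhs(t + dt / 2, x + dt / 2 * k1x, v + dt / 2 * k1v)
        k3x, k3v = rhs(t + dt / 2, x + dt / 2 * k2x, v + dt / 2 * k2v)
        k4x, k4v = rhs(t + dt, x + dt * k3x, v + dt * k3v)
        x += dt / 6 * (k1x + 2 * k2x + 2 * k3x + k4x)
        v += dt / 6 * (k1v + 2 * k2v + 2 * k3v + k4v)
        t += dt
    return x, v


if __name__ == "__main__":
    # c = 2: the integral of (s + 1)**(-2) is finite, so x(t) converges to x0 + v0.
    print(rk4_free(0.0, 1.0, 10.0, 1e-3, 2.0), closed_form(10.0, 0.0, 1.0, 2.0))
    # c = 1/2: the integral diverges, so x(t) is unbounded.
    print(closed_form(1e6, 0.0, 1.0, 0.5))
```

With $c = 2$ the position stays below $x(0) + \dot x(0)$, while with $c = 1/2$ the closed form grows like $2\sqrt{t}$, in line with the finite/infinite dichotomy stated above.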

4.1. One-dimensional case. First, we give a general result of non-convergence of
the trajectories under the condition

(27)    $\int_0^\infty e^{-\int_0^t a(s)\,ds}\, dt = \infty$.

Note that it is automatically satisfied if $\int_0^\infty a(s)\,ds < \infty$. Condition (27) expresses
that the parametrization $a$ tends to zero rather rapidly. For example, assume that
the map $a$ is of the form $a(t) = c/(t+1)^\gamma$, with $\gamma, c > 0$. It is immediate to
check that condition (27) is satisfied if and only if $(\gamma, c) \in (1, \infty) \times \mathbb{R}_+$ or $(\gamma, c) \in
\{1\} \times [0, 1]$. Let us now state a preliminary result.

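Lemma 4.1. Let $a : \mathbb{R}_+ \to \mathbb{R}_+$ be a continuous map such that $\int_0^\infty e^{-\int_0^t a(s)\,ds}\, dt = \infty$.

(i) Suppose that the map $p \in C^2([t_0, \infty), \mathbb{R})$ satisfies

(28)    $\forall t \ge t_0, \quad \ddot p(t) + a(t)\, \dot p(t) \le 0$.

Then, we have either $\lim_{t\to\infty} p(t) = -\infty$ or $\dot p(t) \ge 0$ for every $t \in [t_0, \infty[$.

(ii) Assume that the map $p$ satisfies moreover:

(29)    $\forall t \ge t_0, \quad \ddot p(t) + a(t)\, \dot p(t) = 0$.

Then, either $\lim_{t\to\infty} |p(t)| = \infty$ or $p(t) = p(t_0)$ for every $t \in [t_0, \infty[$.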
Proof. (i) Assume that there exists $t_1 \in [t_0, \infty[$ such that $\dot p(t_1) < 0$. Let us multiply
inequality (28) by $e^{\int_{t_1}^t a(s)\,ds}$ to obtain

    $\frac{d}{dt} \left[ e^{\int_{t_1}^t a(s)\,ds}\, \dot p(t) \right] \le 0$.

Let us integrate the above inequality on the interval $[t_1, t]$, for $t \ge t_1$:

    $\dot p(t) \le \dot p(t_1)\, e^{-\int_{t_1}^t a(s)\,ds}$.

By integrating again, we find

    $\forall t \ge t_1, \quad p(t) \le p(t_1) + \dot p(t_1) \int_{t_1}^t e^{-\int_{t_1}^s a(u)\,du}\, ds$.

Recalling that $\dot p(t_1) < 0$ and that $\int_0^\infty e^{-\int_0^s a(u)\,du}\, ds = \infty$, we conclude that $\lim_{t\to\infty} p(t) =
-\infty$.

(ii) Assume now that the map $p$ satisfies equality (29) and that there exists $t_1 \in
[t_0, \infty[$ such that $\dot p(t_1) \neq 0$. The same computation as above shows that

    $\forall t \ge t_1, \quad p(t) = p(t_1) + \dot p(t_1) \int_{t_1}^t e^{-\int_{t_1}^s a(u)\,du}\, ds$.

Since $\dot p(t_1) \neq 0$ and the integral $\int_0^\infty e^{-\int_0^s a(u)\,du}\, ds$ is divergent, we conclude that
$\lim_{t\to\infty} p(t) = \pm\infty$ (depending on the sign of $\dot p(t_1)$). □

Lemma 4.1 is a crucial tool in the proof of the following non-convergence result.

Proposition 4.1. Let $G : \mathbb{R} \to \mathbb{R}$ be a convex function of class $C^1$ such that $G'$ is
Lipschitz continuous on the bounded sets of $\mathbb{R}$. Assume that $\operatorname{argmin} G = [\alpha, \beta]$, for
some $\alpha, \beta \in \mathbb{R}$ such that $\alpha < \beta$. Let $a : \mathbb{R}_+ \to \mathbb{R}_+$ be a continuous map such
that $\int_0^\infty e^{-\int_0^t a(s)\,ds}\, dt = \infty$. Given $(x_0, \dot x_0) \in \mathbb{R}^2$, consider the unique solution $x$ to
the differential equation (S) satisfying the initial conditions $(x(0), \dot x(0)) = (x_0, \dot x_0)$. If
$(x_0, \dot x_0) \notin [\alpha, \beta] \times \{0\}$, the $\omega$-limit set $\omega(x_0, \dot x_0)$ contains $[\alpha, \beta]$, hence the trajectory $x$
does not converge.

Proof. Let us assume that

(30)    $\exists t_0 \ge 0, \ \forall t \ge t_0, \quad x(t) \ge \alpha$

and let us prove that this leads to a contradiction. First of all, assertion (30) implies
that $G'(x(t)) \ge 0$ for every $t \ge t_0$, which in view of (S) entails that

    $\forall t \ge t_0, \quad \ddot x(t) + a(t)\, \dot x(t) \le 0$.

Since the map $x$ is bounded, we deduce from Lemma 4.1 (i) that $\dot x(t) \ge 0$ for every
$t \ge t_0$, hence $\bar x := \lim_{t\to\infty} x(t)$ exists. From Proposition 2.5 we have $G'(\bar x) =
0$, i.e. $\bar x \in [\alpha, \beta]$. It ensues that $x(t) \in [\alpha, \beta]$ for every $t \ge t_0$. Hence we have
$G'(x(t)) = 0$ for every $t \ge t_0$ and we infer that

    $\forall t \ge t_0, \quad \ddot x(t) + a(t)\, \dot x(t) = 0$.

Since $x([t_0, \infty[) \subset [\alpha, \beta]$, we derive from Lemma 4.1 (ii) that $x(t) = x(t_0)$ for every
$t \ge t_0$. Thus, it follows by backward uniqueness that we have a stationary solution,
which contradicts the assumption $(x_0, \dot x_0) \notin [\alpha, \beta] \times \{0\}$. Hence, we deduce
that assertion (30) is false, so that we can build a sequence $(t_n)$ tending to $\infty$ such
that $x(t_n) < \alpha$. In a symmetric way, we can construct a sequence $(u_n)$ tending to
$\infty$ such that $x(u_n) > \beta$. Recalling that the $\omega$-limit set $\omega(x_0, \dot x_0)$ is connected, we
conclude that $\omega(x_0, \dot x_0) \supset [\alpha, \beta]$. □
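The non-convergence phenomenon of Proposition 4.1 can be observed on a simple example. The sketch below is only an illustration under stated assumptions: the potential $G(x) = \frac{1}{2}\operatorname{dist}(x, [-1, 1])^2$ and the damping $a(t) = 1/(t+1)^2$ (which is integrable, so the divergence condition of the proposition holds) are choices made here, and the helper names are hypothetical. The trajectory is integrated with a classical Runge-Kutta scheme.

```python
def grad_G(x, alpha=-1.0, beta=1.0):
    """G'(x) for G(x) = dist(x, [alpha, beta])**2 / 2: convex, C^1, argmin G = [alpha, beta]."""
    if x > beta:
        return x - beta
    if x < alpha:
        return x - alpha
    return 0.0


def damping(t):
    """Illustrative damping a(t) = 1 / (t + 1)**2, integrable on [0, infinity)."""
    return 1.0 / (t + 1.0) ** 2


def rhs(t, x, v):
    """First-order form of x'' + a(t) x' + G'(x) = 0."""
    return v, -damping(t) * v - grad_G(x)


def simulate(x0, v0, t_end, dt=0.01):
    """Classical RK4; returns the lists of times and positions."""
    t, x, v = 0.0, x0, v0
    ts, xs = [t], [x]
    for _ in range(int(round(t_end / dt))):
        k1x, k1v = rhs(t, x, v)
        k2x, k2v = rhs(t + dt / 2, x + dt / 2 * k1x, v + dt / 2 * k1v)
        k3x, k3v = rhs(t + dt / 2, x + dt / 2 * k2x, v + dt / 2 * k2v)
        k4x, k4v = rhs(t + dt, x + dt * k3x, v + dt * k3v)
        x += dt / 6 * (k1x + 2 * k2x + 2 * k3x + k4x)
        v += dt / 6 * (k1v + 2 * k2v + 2 * k3v + k4v)
        t += dt
        ts.append(t)
        xs.append(x)
    return ts, xs


if __name__ == "__main__":
    ts, xs = simulate(0.0, 1.0, 200.0)
    late = [x for t, x in zip(ts, xs) if t >= 150.0]
    # The late part of the trajectory still crosses both endpoints of argmin G.
    print(min(late), max(late))
```

Because the total damping $\int_0^\infty a$ is finite here, the energy cannot decay to its minimum, and the computed trajectory keeps sweeping across the whole interval $[-1, 1]$.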

We can now wonder if the converse assertion is true: do the trajectories $x$ of (S)
converge under the condition $\int_0^\infty e^{-\int_0^s a(u)\,du}\, ds < \infty$? When the coefficient $a(t)$ is
constant and positive, the trajectories of the so-called "Heavy Ball with Friction"
system are known to be convergent, see for example [5]. The question is more
delicate in the case of an asymptotically vanishing map $a$. We mention below a
first positive result when the map $a$ is of the form $a(t) = \frac{c}{(t+1)^\gamma}$ with $c, \gamma > 0$.

Proposition 4.2. Let $G : \mathbb{R} \to \mathbb{R}$ be a convex function of class $C^1$ such that $G'$ is
Lipschitz continuous on the bounded sets of $\mathbb{R}$. Assume that $\operatorname{argmin} G = [\alpha, \beta]$ with
$\alpha < \beta$ and that there exists $\delta > 0$ such that

    $\forall \xi \in (-\infty, \alpha], \quad G'(\xi) \le \delta (\xi - \alpha) \qquad$ and $\qquad \forall \xi \in [\beta, \infty), \quad G'(\xi) \ge \delta (\xi - \beta)$.

Given $c, \gamma > 0$, let $a : \mathbb{R}_+ \to \mathbb{R}_+$ be the map defined by $a(t) = \frac{c}{(t+1)^\gamma}$ for every $t \ge 0$. If
$\gamma \in (0, 1)$ or if $\gamma = 1$ and $c > 1$, then for any solution $x$ of (S), $\lim_{t\to\infty} x(t)$ exists.

We omit the proof of this result since it is rather technical and it will be developed
more fully in a future paper.

4.2. Multi-dimensional case. Our purpose now is to extend the result of Proposition
4.1 to the case of a dimension greater than one. The situation is much more
complicated since we have to take into account the geometry of the set $\operatorname{argmin} G$.
In the sequel, we will assume that the gradient of $G$ satisfies the following condition:

(31)    $\forall \bar x \in \mathrm{bd}(\operatorname{argmin} G), \quad \lim_{x \to \bar x,\ x \notin \operatorname{argmin} G} \frac{\nabla G(x)}{|\nabla G(x)|}$ exists.

If $G$ is a convex function of a single variable, i.e. $H = \mathbb{R}$, this condition is satisfied
when the set $\operatorname{argmin} G$ is not reduced to a singleton. Before stating the main result
of non-convergence for the trajectories of (S), let us first recall some basic notions
of convex analysis. The polar cone $K^*$ of a cone $K \subset H$ is defined by

    $K^* = \{ y \in H \mid \forall x \in K, \ \langle x, y \rangle \le 0 \}$.

Let $S \subset H$ be a closed convex set and let $x \in S$. The normal cone $N_S(x)$ and the
tangent cone $T_S(x)$ are respectively defined by

    $N_S(x) = \{ \xi \in H \mid \forall y \in S, \ \langle \xi, y - x \rangle \le 0 \}$,
    $T_S(x) = \mathrm{cl}\left[ \bigcup_{\lambda > 0} \lambda (S - x) \right]$.

The convex cones $N_S(x)$ and $T_S(x)$ are polar to each other, i.e. $N_S(x) = [T_S(x)]^*$
and $T_S(x) = [N_S(x)]^*$. For further details relative to convex analysis, the reader is
referred to Rockafellar's book [17].
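As an illustration of these definitions, here is a worked example chosen for this purpose (it is not taken from the paper): the closed unit ball $S = B$ and the potential $G(x) = \frac{1}{2}\operatorname{dist}(x, B)^2$, whose set of minimizers is $B$.

```latex
% Illustrative example: S = B (closed unit ball of H), \bar x with |\bar x| = 1.
\[
N_B(\bar x) = \mathbb{R}_+ \bar x,
\qquad
T_B(\bar x) = \{ v \in H \mid \langle v, \bar x \rangle \le 0 \},
\]
% and the two cones are polar to each other:
\[
[N_B(\bar x)]^* = \{ v \mid \langle \lambda \bar x, v \rangle \le 0 \ \ \forall \lambda \ge 0 \} = T_B(\bar x).
\]
% For G(x) = (1/2) dist(x, B)^2 one has, for |x| > 1,
\[
\nabla G(x) = \Bigl(1 - \tfrac{1}{|x|}\Bigr) x,
\qquad
\frac{\nabla G(x)}{|\nabla G(x)|} = \frac{x}{|x|} \;\longrightarrow\; \bar x
\quad \text{as } x \to \bar x,\ x \notin B,
\]
% so the limit required in condition (31) exists and spans the normal cone at \bar x.
```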

The next lemma brings to light a geometrical property that will be crucial in the 
proof of the next theorem. 

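Lemma 4.2. Let $G : H \to \mathbb{R}$ be a convex function of class $C^1$ such that $S = \operatorname{argmin} G \neq \emptyset$.
Given $\bar x \in \mathrm{bd}(S)$, assume that $d := \lim_{x \to \bar x,\ x \notin S} \nabla G(x)/|\nabla G(x)|$ exists. Then we
have:

(i) $N_S(\bar x) = \mathbb{R}_+ d$.¹

(ii) There exist a neighborhood $V$ of $\bar x$, a closed convex cone $K \subset H$ along with a positive
real $\eta > 0$ such that:

($C_{\bar x}$)    $\forall x \in V, \quad \nabla G(x) \in K \qquad$ and $\qquad -K \cap \eta B \subset S - \bar x$.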



¹When the normal cone $N_S(\bar x)$ is reduced to a half-line, the set $S$ is said to be smooth at $\bar x$. Hence
item (i) shows that the existence of $\lim_{x \to \bar x,\ x \notin S} \nabla G(x)/|\nabla G(x)|$ implies smoothness of the set $S$ at $\bar x$.

Proof. (i) Since the function $G$ is convex and since $\nabla G(\bar x) = 0$, we have

(32)    $\forall x \in H, \quad \langle \nabla G(x), x - \bar x \rangle \ge 0$.

Let $v \notin T_S(\bar x)$ and take the vector $x$ equal to $\bar x + t v$, for some $t > 0$. Remark that,
since $v \notin T_S(\bar x)$, we have $\bar x + t v \notin S$ and hence $\nabla G(\bar x + t v) \neq 0$ for every $t > 0$.
From (32), we derive that

    $\left\langle \frac{\nabla G(\bar x + t v)}{|\nabla G(\bar x + t v)|}, v \right\rangle \ge 0$.

Taking the limit as $t \to 0^+$, we infer that $\langle d, v \rangle \ge 0$. Since this is true for every
$v \notin T_S(\bar x)$, we conclude that $-d \in [H \setminus T_S(\bar x)]^*$. Let us denote by $H_{x,\le}$ (resp.
$H_{x,>}$) the closed (resp. open) half-space defined by $H_{x,\le} = \{ y \in H, \ \langle x, y \rangle \le 0 \}$
(resp. $H_{x,>} = \{ y \in H, \ \langle x, y \rangle > 0 \}$). The polarity relation $T_S(\bar x) = [N_S(\bar x)]^*$ can
be equivalently rewritten as $T_S(\bar x) = \bigcap_{x \in N_S(\bar x)} H_{x,\le}$. Then it follows that

    $-d \in [H \setminus T_S(\bar x)]^* = \left[ \bigcup_{x \in N_S(\bar x)} H_{x,>} \right]^* = \bigcap_{x \in N_S(\bar x)} [H_{x,>}]^* = \bigcap_{x \in N_S(\bar x)} \mathbb{R}_- x$.

If the cone $N_S(\bar x)$ is not reduced to a half-line, the above intersection equals $\{0\}$,
which contradicts the fact that $|d| = 1$. Hence the cone $N_S(\bar x)$ is equal to a half-line
and the above inclusion shows that $N_S(\bar x) = \mathbb{R}_+ d$.

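(ii) Since $\lim_{x \to \bar x,\ x \notin S} \nabla G(x)/|\nabla G(x)| = d$, there exists a neighbourhood $V$ of $\bar x$
such that

    $\forall x \in V, \quad x \notin S \implies \left\langle d, \frac{\nabla G(x)}{|\nabla G(x)|} \right\rangle \ge \frac{1}{2}$.

Let us define the cone $K = \{0\} \cup \left\{ v \in H, \ \left\langle d, \frac{v}{|v|} \right\rangle \ge \frac{1}{2} \right\}$. It is clear that $K$ is a
closed convex cone and that $\nabla G(x) \in K$ for every $x \in V$. On the other hand, since
$N_S(\bar x) = \mathbb{R}_+ d$, we have

    $\liminf_{v \to 0,\ v \notin S - \bar x} \left\langle d, \frac{v}{|v|} \right\rangle \ge 0$.

Therefore, there exists $\eta > 0$ such that

    $v \in \eta B$ and $v \notin S - \bar x \implies \left\langle d, \frac{v}{|v|} \right\rangle > -\frac{1}{2}$.

Since for every $v \in -K \setminus \{0\}$, $\left\langle d, \frac{v}{|v|} \right\rangle \le -\frac{1}{2}$, we deduce that $-K \cap \eta B \subset S - \bar x$,
which completes the proof of property ($C_{\bar x}$). □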

Let us now state the general result of non-convergence for the trajectories of (S)
under the condition $\int_0^\infty e^{-\int_0^t a(s)\,ds}\, dt = \infty$.

Theorem 4.1. Let $G : H \to \mathbb{R}$ be a convex coercive function of class $C^1$ such that $\nabla G$
is Lipschitz continuous on the bounded sets of $H$. Assume moreover that the geometric
property (31) holds. Let $a : \mathbb{R}_+ \to \mathbb{R}_+$ be a continuous map such that $\int_0^\infty e^{-\int_0^t a(s)\,ds}\, dt =
\infty$. Given $(x_0, \dot x_0) \in H^2$, consider the unique solution $x$ of (S) satisfying $(x(0), \dot x(0)) =
(x_0, \dot x_0)$. If $(x_0, \dot x_0) \notin \operatorname{argmin} G \times \{0\}$, then the trajectory $x$ of (S) does not converge.

Proof. For simplicity of notation, let us set $S = \operatorname{argmin} G$. Let us prove the contrapositive
of the previous statement and assume that there exists $\bar x \in H$ such that
$\lim_{t\to\infty} x(t) = \bar x$. We must prove that in this case $x_0 \in S$, $\dot x_0 = 0$. From Proposition
2.5 we have $\nabla G(\bar x) = 0$, hence $\bar x \in S$. If $\bar x \in \mathrm{bd}(S)$, condition ($C_{\bar x}$) is satisfied
in view of assumption (31) and Lemma 4.2 (ii). On the other hand, if $\bar x \in \mathrm{int}(S)$,
condition ($C_{\bar x}$) is trivially satisfied with $K = \{0\}$. In both cases, we derive the
existence of a closed convex cone $K \subset H$ along with $\eta > 0$ and $t_0 \ge 0$ such that

(33)    $\forall t \ge t_0, \quad \nabla G(x(t)) \in K \qquad$ and $\qquad -K \cap \eta B \subset S - \bar x$.

Let $v \in -K^*$ and take the scalar product of (S) by the vector $v$. Since $\langle \nabla G(x(t)), v \rangle \ge 0$
for every $t \ge t_0$, we deduce that

    $\forall t \ge t_0, \quad \langle \ddot x(t), v \rangle + a(t)\, \langle \dot x(t), v \rangle \le 0$.

Let us apply Lemma 4.1 (i) to the map $p$ defined by $p(t) = \langle x(t), v \rangle$. Since the
trajectory $x$ is bounded, we infer that $\langle \dot x(t), v \rangle \ge 0$ for every $t \ge t_0$. By integrating
on the interval $[t, \infty[$, we find $\langle x(t) - \bar x, v \rangle \le 0$ for every $t \ge t_0$. Since this is true
for every $v \in -K^*$, we derive that $x(t) - \bar x \in -K^{**}$ for every $t \ge t_0$. Recalling
that $K^{**} = K$ for every closed convex cone $K$, we conclude that $x(t) - \bar x \in -K$ for
every $t \ge t_0$. On the other hand, since $\lim_{t\to\infty} x(t) = \bar x$, there exists $t_1 \ge t_0$ such
that $x(t) - \bar x \in \eta B$ for every $t \ge t_1$. In view of (33), we infer that $x(t) \in S$ for
every $t \ge t_1$, so that the differential equation (S) becomes

    $\ddot x(t) + a(t)\, \dot x(t) = 0, \qquad t \ge t_1$.

By arguing as in the proof of Lemma 4.1 (ii), we deduce that either $\lim_{t\to\infty} |x(t)| =
\infty$ or $x(t) = x(t_1)$ for every $t \ge t_1$. Since the map $x$ converges toward $\bar x$, the
first eventuality does not hold. It follows by backward uniqueness that we have
a stationary solution, $x(t) = x_0$ for all $t$, which must therefore satisfy $(x_0, \dot x_0) \in
S \times \{0\}$. □

5. The case of a Non-Convex Potential 

In this section, we discuss the case where $G$ is defined on $\mathbb{R}^n$ and has multiple
critical points, but does not necessarily satisfy condition (10). Instead, we will
assume that

(a) $G$ has finitely many critical points $x_1, x_2, \ldots, x_N$.

(b) $G$ attains different values on them, i.e. we can order them such that

    $\lambda_1 = G(x_1) < \lambda_2 = G(x_2) < \lambda_3 = G(x_3) < \ldots < \lambda_N = G(x_N)$.

This is the "generic case". We will also use the assumption

(c) $a : \mathbb{R}_+ \to \mathbb{R}_+$ is non increasing such that $\int_0^\infty a(s)\,ds = \infty$.

Our first result shows that in this case, for each solution there exists exactly one 
critical point that is visited for arbitrarily long times. 

Proposition 5.1. Let $G : \mathbb{R}^n \to \mathbb{R}$ be a coercive function of class $C^1$ such that $\nabla G$
is Lipschitz continuous on the bounded sets of $\mathbb{R}^n$. If the assumptions (a)-(c) above are
satisfied, then there exists a unique $x^* \in \{x_1, x_2, \ldots, x_N\}$ such that

    $\lim_{t\to\infty} \mathcal{E}(t) = G(x^*)$.

Also, for all $T > 0$,

    $\liminf_{t\to\infty} \sup_{s \in [t, t+T]} |x(s) - x^*| = 0$.

Moreover $x^*$ is the only point of accumulation that is visited for arbitrarily long time
intervals. If $x^*$ is a local minimum of $G$, then in fact

    $\lim_{t\to\infty} |x(t) - x^*| = 0$.

Proof. From Proposition 2.3 there are arbitrarily long time intervals $[S, T]$ where
$|\dot x(t)|$ is arbitrarily small. Since $t \mapsto \mathcal{E}(t) = \frac{1}{2} |\dot x(t)|^2 + G(x(t))$ is decreasing and
bounded below, its limit exists. If this is not equal to $G(x_j)$ for some $j \in \{1, \ldots, N\}$,
then there exists an interval $[a, b]$ that does not contain any critical value of $G$,
such that $G(x(t)) \in [a, b]$ for every $t \in [S, T]$. Now, we can find $\delta > 0$ such that
$|g| \ge \delta$ on $G^{-1}([a, b])$, and thus $|g(x(t))| \ge \delta$ for every $t \in [S, T]$. However,
this contradicts Proposition 2.4. Thus, there exists $x^* \in \{x_1, x_2, \ldots, x_N\}$ such that
$\lim_{t\to\infty} \mathcal{E}(t) = G(x^*)$. Therefore, $|x(t) - x^*|$ becomes arbitrarily small on arbitrarily
long intervals by Proposition 2.4. To show that there is no other point $x^{**}$ such
that

    $\liminf_{t\to\infty} \sup_{s \in [t, t+T]} |x(s) - x^{**}| = 0$,

first suppose that such a point exists with $G(x^{**}) = G(x^*)$. Note that then
$g(x^{**}) \neq 0$. On the other hand, $|\dot x(t)|$ must become arbitrarily small on these same
intervals, since $\lim_{t\to\infty} \mathcal{E}(t) = G(x^*) = G(x^{**})$, and therefore also $g(x(t))$ can be
made arbitrarily small on such intervals. But this is impossible, since $g(x^{**}) \neq 0$.

Next, suppose that $G(x^{**}) < G(x^*)$. In this case, we can find arbitrarily long
intervals $[S, T]$ where $|x(t) - x^{**}|$ is small and therefore $G(x(t)) < G(x^*) - \delta$ for
some $\delta > 0$. Then $|\dot x(t)|^2 \ge 2\delta$ on these intervals. However, the map $\ddot x$ is uniformly
bounded on any such interval. By applying Landau's inequality on the interval
$[S, T]$,

    $\|\dot x\|_\infty \le 2 \sqrt{ \|x - x^{**}\|_\infty\, \|\ddot x\|_\infty }$,

we obtain a contradiction. Hence no such point $x^{**}$ exists.
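The contradiction obtained from Landau's inequality can be made explicit. In the sketch below, $\eta$ and $M$ are names introduced here for the two sup norms on $[S, T]$; they do not appear in the original argument.

```latex
% Notation introduced for this sketch:
%   \eta := \sup_{t \in [S,T]} |x(t) - x^{**}|   (can be made arbitrarily small),
%   M    := a uniform bound on |\ddot x(t)|      (independent of the interval).
\[
\sqrt{2\delta} \;\le\; \|\dot x\|_\infty \;\le\; 2 \sqrt{\|x - x^{**}\|_\infty \, \|\ddot x\|_\infty}
\;\le\; 2 \sqrt{\eta M}.
\]
% Squaring gives 2\delta \le 4 \eta M, i.e. \eta \ge \delta / (2M).
% This fails as soon as the interval [S,T] is chosen with \eta < \delta / (2M),
% which is the announced contradiction.
```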

Since $\lim_{t\to\infty} \mathcal{E}(t) = G(x^*)$, the solution $x(t)$ must eventually enter and remain
in the connected component of $G^{-1}((-\infty, G(x^*) + \varepsilon))$ that contains $x^*$, for any
$\varepsilon > 0$. If $x^*$ is a local minimum of $G$ (which is strict by assumption), then the
intersection of these neighborhoods of $x^*$ is just $\{x^*\}$; hence $\lim_{t\to\infty} x(t) = x^*$. □

We now give a result that shows that the density of the times $t \in \mathbb{R}_+$ when
$x(t)$ is near the critical point $x^*$ approaches 1. This is comparable to a result on
"convergence in probability".

Theorem 5.1. In addition to the hypotheses of Proposition 5.1, we assume that $\dot a \in
L^1_{loc}(\mathbb{R}_+)$ and that there exists $c > 0$ such that $a(t) \ge \frac{c}{t+1}$ for every $t \ge 0$. Then
there is a unique stationary point $x^* \in \{x_1, \ldots, x_N\}$ of $G$ such that for any $\varepsilon > 0$

(34)    $\lim_{T\to\infty} T^{-1} \left| \{ t \le T \mid |x(t) - x^*| > \varepsilon \} \right| = 0$,

where $|A|$ denotes the one-dimensional Lebesgue measure of a measurable set $A \subset \mathbb{R}$.

Proof. We first recall estimate (9), which implies

(35)    $\int_0^\infty \frac{1}{t+1} |\dot x(t)|^2\, dt < \infty$.

We are now going to show that

(36)    $\int_0^\infty \frac{1}{t+1} |g(x(t))|^2\, dt < \infty$.

We write $b(t) = \frac{1}{t+1}$. Form the scalar product of (S) with $b(t)\, g(x(t))$ and integrate
over $[0, T]$, for some $T > 0$. The result is the identity

    $\int_0^T b(t) |g(x(t))|^2\, dt = -\int_0^T a(t) b(t) \langle \dot x(t), g(x(t)) \rangle\, dt - \int_0^T b(t) \langle \ddot x(t), g(x(t)) \rangle\, dt$.

Integrating by parts, the first integral on the right hand side becomes

    $-\int_0^T a(t) b(t) \frac{d}{dt} G(x(t))\, dt = -\Big[ a(t) b(t) G(x(t)) \Big]_{t=0}^{t=T} + \int_0^T \frac{d}{dt}\big( a(t) b(t) \big)\, G(x(t))\, dt$.

Since $G(x(\cdot))$ is bounded on $[0, \infty)$ and $\dot a b + a \dot b$ is integrable, this term therefore is
bounded. The second integral on the right hand side of the above identity becomes
after two integrations by parts

    $\ldots = \Big[ -b(t) \langle \dot x(t), g(x(t)) \rangle \Big]_{t=0}^{t=T} + \int_0^T \Big\langle \dot x(t), \frac{d}{dt}\big( b(t)\, g(x(t)) \big) \Big\rangle\, dt$
    $\quad = \Big[ -b(t) \langle \dot x(t), g(x(t)) \rangle + \dot b(t)\, G(x(t)) \Big]_{t=0}^{t=T} - \int_0^T \ddot b(t)\, G(x(t))\, dt$
    $\qquad + \int_0^T b(t) \langle \dot x(t), Dg(x(t))\, \dot x(t) \rangle\, dt$,

where $Dg(x(t))$ is the derivative of $g$ at $x(t)$. The first two terms are both bounded
due to previous estimates, and $\|Dg(x(t))\| \le M$ for all $t$ since the trajectory $x(\cdot)$ is
bounded and $g = \nabla G$ is Lipschitz, uniformly on bounded sets. Therefore the last
integral is bounded in magnitude by

    $\left| \int_0^T b(t) \langle \dot x(t), Dg(x(t))\, \dot x(t) \rangle\, dt \right| \le M \int_0^T b(t) |\dot x(t)|^2\, dt \le \frac{M}{c} \int_0^T a(t) |\dot x(t)|^2\, dt$,

which remains uniformly bounded for all $T > 0$. This proves (36).

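By Proposition 5.1, $\lim_{t\to\infty} \mathcal{E}(t) = G(x^*) = \lambda_i$ for some stationary point $x^* = x_i$
of $G$. Pick $\delta > 0$ such that $\lambda_{i-1} + \delta < \lambda_i < \lambda_{i+1} - \delta$, and let $T_0 > 0$ be so large that
$\mathcal{E}(t) < \lambda_i + \delta$ if $t \ge T_0$. For the remainder of the proof, we assume without loss of
generality that $T_0 = 0$. Let $\varepsilon > 0$, then there exists $\gamma > 0$ such that

(37)    $|x(t) - x^*| > \varepsilon \implies |\dot x(t)| \ge \sqrt{2\delta}$ or $|g(x(t))| \ge \gamma$.

Indeed, assume that $|x(t) - x^*| > \varepsilon$ and that $|\dot x(t)| < \sqrt{2\delta}$. First we have

    $G(x(t)) = \mathcal{E}(t) - \frac{1}{2} |\dot x(t)|^2 > \lambda_i - \delta$.

Now set

    $\gamma = \min \{ |g(\xi)| \mid |\xi - x^*| \ge \varepsilon, \ \lambda_i - \delta \le G(\xi) \le \lambda_i + \delta \}$.

The quantity $\gamma$ is positive since there are no critical points of $G$ in the compact
region over which $|g(\cdot)|$ is minimized. Hence assertion (37) is proved. Therefore,
we deduce that

    $\left| \{ t \mid |x(t) - x^*| > \varepsilon \} \right| \le \left| \{ t \mid |\dot x(t)| \ge \sqrt{2\delta} \text{ or } |g(x(t))| \ge \gamma \} \right|$
    $\qquad \le \left| \{ t \mid |\dot x(t)| \ge \sqrt{2\delta} \} \right| + \left| \{ t \mid |g(x(t))| \ge \gamma \} \right|$.

By estimates (35) and (36) combined with Lemma 5.1, we derive that

    $\lim_{T\to\infty} T^{-1} \left| \{ t \le T \mid |x(t) - x^*| > \varepsilon \} \right| \le$
    $\qquad \lim_{T\to\infty} T^{-1} \left| \{ t \le T \mid |\dot x(t)| \ge \sqrt{2\delta} \} \right| + \lim_{T\to\infty} T^{-1} \left| \{ t \le T \mid |g(x(t))| \ge \gamma \} \right| = 0$.

The theorem has been proved. □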

As a consequence, a Cesàro average of the solution $x(\cdot)$ converges to the critical
point $x^*$.

Corollary 5.1. Under the conditions of Theorem 5.1, all solutions of (S) satisfy

    $\lim_{T\to\infty} \frac{1}{T} \int_0^T x(t)\, dt = x^*$

for some critical point $x^*$ of $G$.

Proof. Let $x^*$ be the stationary point identified in Theorem 5.1. Given $\varepsilon > 0$ and
$T > 0$, we have

    $\left| \frac{1}{T} \int_0^T x(t)\, dt - x^* \right| \le \frac{1}{T} \int_0^T |x(t) - x^*|\, dt \le \varepsilon + \frac{M}{T} \left| \{ t \le T \mid |x(t) - x^*| > \varepsilon \} \right|$,

where $M = \sup_{t \ge 0} |x(t) - x^*| < \infty$. The lim sup of the right hand side is no larger
than $\varepsilon$, which proves the corollary. □
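The second inequality in the proof of Corollary 5.1 comes from splitting the time interval according to the size of $|x(t) - x^*|$. The following sketch writes this out; the notation $A_T$ is introduced here for brevity.

```latex
% Notation for this sketch: A_T := \{ t \le T \mid |x(t) - x^*| > \varepsilon \}.
\begin{align*}
\frac{1}{T} \int_0^T |x(t) - x^*|\, dt
  &= \frac{1}{T} \int_{[0,T] \setminus A_T} |x(t) - x^*|\, dt
   + \frac{1}{T} \int_{A_T} |x(t) - x^*|\, dt \\
  &\le \varepsilon\, \frac{T - |A_T|}{T} + M\, \frac{|A_T|}{T}
   \;\le\; \varepsilon + \frac{M}{T}\, |A_T| .
\end{align*}
% By (34), |A_T| / T \to 0 as T \to \infty, so the lim sup of the right hand side is at most \varepsilon.
```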

In the same direction, a result of convergence of the ergodic mean has been
obtained by Brezis [6] for trajectories associated to semigroups of nonlinear contractions.

Let us now establish a result that was useful in the proof of Theorem 5.1.

Lemma 5.1. Let $w : [0, \infty) \to \mathbb{R}$ be measurable, bounded, and non-negative. If

    $\int_0^\infty \frac{w(t)}{t+1}\, dt < \infty$,

then for all $\varepsilon > 0$

    $\lim_{T\to\infty} T^{-1} \left| \{ t \le T \mid w(t) > \varepsilon \} \right| = 0$.

Proof. Let $\varepsilon > 0$. For $0 \le S < T < \infty$, set $A(\varepsilon, S, T) = \{ S \le t \le T \mid w(t) > \varepsilon \}$. It is
clearly sufficient to show that for all $\delta \in (0, 1)$ we can find $S > 0$ such that

    $\limsup_{T\to\infty} (T+1)^{-1} |A(\varepsilon, S, T)| \le \delta$.

For this purpose, define $\rho = -\log(1 - \delta)$, and choose $S > 0$ large enough such
that $\int_S^\infty \frac{w(t)}{t+1}\, dt \le \varepsilon \rho$. Then for $T > S$,

    $\varepsilon \rho \ge \int_S^T \frac{w(t)}{t+1}\, dt \ge \varepsilon \int_{A(\varepsilon, S, T)} \frac{dt}{t+1}$.

Clearly the integral on the right is minimized if $A(\varepsilon, S, T) = [T - |A(\varepsilon, S, T)|, T]$,
and therefore

    $\varepsilon \rho \ge \varepsilon \int_{T - |A(\varepsilon, S, T)|}^{T} \frac{dt}{t+1} = \varepsilon \log \frac{T+1}{T + 1 - |A(\varepsilon, S, T)|}$.

By the choice of $\rho$, this inequality becomes $\frac{|A(\varepsilon, S, T)|}{T+1} \le \delta$. □

We stress the fact that the above proof mainly relies on a variant of Markov's
inequality.

The main remaining open question is whether $\lim_{t\to\infty} x(t)$ exists if $a(t) \to 0$ as
$t \to \infty$. There is a unique stationary point of $G$ that is visited for long times. If
this point is a local minimum of $G$, the trajectory will converge to it by Proposition
5.1. If the point is a local maximum, it appears possible to adapt the arguments
from the next section to show that again convergence holds. The difficulty is that
in more than one dimension, the stationary point that is visited for long times may
be a saddle point of $G$. Then it is possible that the solution visits other regions of
$\mathbb{R}^n$ intermittently for finite amounts of time, infinitely often, spending longer and
longer periods of time near the saddle point in between. In one dimension, such
a behavior cannot occur, since solutions either get trapped near local minima of
$G$, or if they visit local maxima of $G$ without converging to them, they must leave
their neighborhood rapidly.

To end this section, we show that if $a(\cdot)$ is bounded away from $0$ (e.g. if $a(\cdot)$ is
a positive constant), solutions of (S) always converge.

Proposition 5.2. Let $a : \mathbb{R}_+ \to \mathbb{R}_+$ be a non increasing map such that $a(t) \ge a_0$ for all
$t \ge 0$, with some $a_0 > 0$. Let $G : \mathbb{R}^n \to \mathbb{R}$ be a coercive function of class $C^1$, such that
$\nabla G$ is Lipschitz continuous on the bounded sets of $\mathbb{R}^n$. Assume that $G$ has finitely many
critical points $x_1, x_2, \ldots, x_N$. Then, for any solution $x$ to the differential equation (S),
there exists $x^* \in \{x_1, x_2, \ldots, x_N\}$ such that $\lim_{t\to\infty} x(t) = x^*$.

Proof. The assumption $a(t) \ge a_0 > 0$ implies that $\int_0^\infty |\dot x(t)|^2\, dt < \infty$ and hence
$\lim_{t\to\infty} \dot x(t) = 0$, since the map $\ddot x$ is uniformly bounded. From Proposition 2.4, we
derive that $\lim_{t\to\infty} g(x(t)) = 0$. Since the set of zeroes of $g$ is discrete, this implies
that $x(t)$ converges to one of the critical points of $G$. □

6. The One-Dimensional Case

Let us consider the equation (S) in the one-dimensional case. The derivative $\dot x$
changes sign either finitely many times or infinitely many times. In the first case,
solutions must have a limit, while the second case can occur either if the solution
approaches a limit or if the $\omega$-limit set of the trajectory is a non-empty interval. We
shall give conditions that exclude this last possibility. Rather, trajectories always
have a limit, and moreover solutions oscillate infinitely if and only if this limit is a
local minimum of $G$. We show further that in this case the set of initial conditions
for which solutions converge to a local minimum is open and dense.

To describe the behavior of the trajectories more precisely, let us write

    $w(t) = \mathcal{E}(t) = G(x(t)) + \frac{1}{2} |\dot x(t)|^2$

and observe that

    $\dot w(t) = -a(t) |\dot x(t)|^2 = 2 a(t) \left( G(x(t)) - w(t) \right)$

and

    $\dot x(t) = \pm \sqrt{2 w(t) - 2 G(x(t))}$.

Assume that $a(t) > 0$ for every $t \ge 0$ and that the solution $x$ is not stationary. It is
obvious that $\dot w(t) < 0$ for $t \ge 0$, except at times $t$ where $w(t) = G(x(t))$, and these
$t$ are precisely those times where $\dot x$ changes sign. The set $\mathcal{T} = \{ t \ge 0 \mid \dot x(t) = 0 \}$
must be discrete. Indeed, if $t^*$ is an accumulation point of $\mathcal{T}$, then there exists a
sequence $(t_i)$ tending toward $t^*$ such that $\dot x(t_i) = 0$, hence $\dot x(t^*) = 0$. By Rolle's
Theorem, there also exists a sequence $(u_i)$ tending toward $t^*$ such that $\ddot x(u_i) = 0$,
hence $\ddot x(t^*) = 0$. Hence we would have the equality $\dot x(t^*) = \ddot x(t^*) = G'(x(t^*)) = 0$
and thus $x$ would have to be a constant solution, a contradiction. Therefore there
exists an increasing sequence $(t_n)$ tending toward $\infty$ such that $\mathcal{T} = \{ t_n, \ n \in \mathbb{N} \}$.
As $\mathcal{T}$ is discrete, $w$ is strictly decreasing, hence $\lim_{t\to\infty} w(t)$ exists if the function
$G$ is bounded from below. If $\mathcal{T}$ is finite, i.e. if $\dot x$ changes its sign finitely often, then
$\lim_{t\to\infty} x(t) = x^*$ exists since $x$ is eventually monotone and is bounded (provided
some coercivity assumption on $G$). In this case, $G'(x^*) = 0$ by Proposition 2.5.
However, without additional assumptions on the maps $a$ and $G$, the trajectory
$x(\cdot)$ need not converge, as Proposition 4.1 shows.

Before giving the main assumptions of this section, let us recall the definitions 
of strong convexity and strong concavity. 

Definition 6.1. Let $G : \mathbb{R} \to \mathbb{R}$ be a function of class $C^1$ and let $x^* \in \mathbb{R}$. The function $G$ is
said to be strongly convex in the neighbourhood of $x^*$ if there exist $\varepsilon, \delta > 0$ such that

(38)    $\forall x, y \in\, ]x^* - \varepsilon, x^* + \varepsilon[, \quad G(y) \ge G(x) + (y - x)\, G'(x) + \delta (y - x)^2$.

It is easy to check that the above property amounts to saying that the map $x \mapsto
G(x) - \delta x^2$ is convex on $]x^* - \varepsilon, x^* + \varepsilon[$. This is also equivalent to the fact that the
map $x \mapsto G'(x) - 2 \delta x$ is non decreasing on $]x^* - \varepsilon, x^* + \varepsilon[$. When the function $G$ is
of class $C^2$, assertion (38) is equivalent to the inequality $G'' \ge 2 \delta$ on $]x^* - \varepsilon, x^* + \varepsilon[$.
Let us now introduce the notion of strong concavity.

Definition 6.2. Let $G : \mathbb{R} \to \mathbb{R}$ be a function of class $C^1$ and let $x^* \in \mathbb{R}$. The function $G$
is said to be strongly concave in the neighbourhood of $x^*$ if $-G$ is strongly convex in the
neighbourhood of $x^*$.
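To make the two definitions concrete, here is a worked example with a quartic potential chosen for illustration (it does not come from the paper). It uses the $C^2$ characterization $G'' \ge 2\delta$ recalled above, with the common radius $\varepsilon = 1/4$.

```latex
% Illustrative potential: G(x) = x^4/4 - x^3/3 - x^2, so that
%   G'(x)  = x (x + 1)(x - 2),      critical points  -1, 0, 2,
%   G''(x) = 3x^2 - 2x - 2.
\begin{align*}
x^* = -1:\quad & G''(x) \ge G''(-\tfrac34) = \tfrac{19}{16} \ \text{on } ]-\tfrac54, -\tfrac34[
   && \Rightarrow \text{strongly convex, } \delta = \tfrac{19}{32}, \\
x^* = 0:\quad  & G''(x) \le G''(-\tfrac14) = -\tfrac{21}{16} \ \text{on } ]-\tfrac14, \tfrac14[
   && \Rightarrow \text{strongly concave, } \delta = \tfrac{21}{32}, \\
x^* = 2:\quad  & G''(x) \ge G''(\tfrac74) = \tfrac{59}{16} \ \text{on } ]\tfrac74, \tfrac94[
   && \Rightarrow \text{strongly convex, } \delta = \tfrac{59}{32}.
\end{align*}
% The critical values G(-1) = -5/12, G(0) = 0, G(2) = -8/3 are pairwise distinct.
```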

We are now able to set up the framework that will be used throughout this 
section. The function G : R — > R of class C 1 will satisfy the following assumptions, 
considered as the "generic" case. 

(a) $G$ has finitely many critical points $x_1 < x_2 < \cdots < x_N$.

(b) If $k \neq r$, then $G(x_k) \neq G(x_r)$.

(c) For all $k \in \{1, \ldots, N\}$, the function $G$ is either strongly convex or
strongly concave in the neighbourhood of $x_k$.

Property (c) implies that the critical points of $G$ correspond either to local minima
or local maxima of $G$. Moreover, property (c) shows that near local minima $x_j$, $G$
satisfies the inequality $G(x) \ge G(x_j) + \delta |x - x_j|^2$ and near local maxima $x_k$, we
have similarly $G(x) \le G(x_k) - \delta |x - x_k|^2$. We can now describe the asymptotic
behavior of solutions of (S).

Theorem 6.1. Let $a : \mathbb{R}_+ \to \mathbb{R}_+$ be a differentiable non increasing map such that
$\lim_{t\to\infty} a(t) = 0$. Assume that there exists $c > 0$ such that $a(t) \ge \frac{c}{t+1}$ for every $t \ge 0$.
Let $G : \mathbb{R} \to \mathbb{R}$ be a coercive function of class $C^1$ such that $G'$ is Lipschitz continuous
on the bounded sets of $\mathbb{R}$. If $G$ satisfies the additional assumptions (a)-(b)-(c) above, then
for any solution $x$ of (S), $\lim_{t\to\infty} x(t)$ exists. Moreover, denoting by $\mathcal{T}$ the set of sign
changes of $\dot x$, the limit is a local maximum of $G$ if and only if the set $\mathcal{T}$ is finite, and it is a
local minimum of $G$ if and only if $\mathcal{T}$ is infinite.

Proof. We have already observed that if $\mathcal{T}$ is finite, then the trajectory must have a
limit and this limit is a critical point of $G$ by Proposition 2.5. Let us now show that
the limit is a local maximum of $G$. Arguing by contradiction, let us assume that $\mathcal{T}$
is finite, and that the limit is a local minimum of $G$. Without loss of generality, we
may assume that $\lim_{t\to\infty} x(t) = 0$ and that $x$ is non-increasing for all sufficiently
large times. Let $\varepsilon > 0$ be such that $G'(\xi) \ge 2 \delta \xi$ for $\xi \in\, ]0, \varepsilon[$ and let $T > 0$ be such
that $0 < x(t) < \varepsilon$ for $t \ge T$. As the map $a$ converges to $0$, one can choose $T$ to get
$a(t) \le 2 \sqrt{\delta}$ for $t \ge T$. Set $A(t) = \exp\left( \frac{1}{2} \int_0^t a(s)\,ds \right)$ and $z(t) = A(t)\, x(t)$, then

    $\ddot z(t) + g(t, z(t)) = 0$

where

    $g(t, \xi) = A(t)\, G'\!\left( \frac{\xi}{A(t)} \right) - \left( \frac{a(t)^2}{4} + \frac{\dot a(t)}{2} \right) \xi$.

Recalling that $\dot a(t) \le 0$ for every $t \ge 0$, we obtain $g(t, \xi) \ge \delta \xi$, for $0 < \xi < A(t)\, \varepsilon$.
We derive that $\ddot z(t) = -g(t, z(t)) < 0$ and $z$ must be concave down. But since the
function $z(t) = A(t)\, x(t)$ is also positive for $t \ge T$, it must be increasing for all
$t \ge T$, implying $\ddot z(t) \le -\delta z(t) \le -\delta z(T)$ for all $t \ge T$. This contradicts the fact
that $z$ remains positive.
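The change of unknown $z = A x$ removes the first-order term from (S). The computation is omitted in the proof; the following sketch fills in the steps using only the definitions of $A$ and $z$ and the equation $\ddot x + a \dot x + G'(x) = 0$.

```latex
% With A(t) = exp( (1/2) \int_0^t a(s) ds ) one has \dot A = (a/2) A, and z = A x.
\begin{align*}
\dot z  &= \dot A\, x + A\, \dot x = \tfrac{a}{2}\, z + A\, \dot x, \\
\ddot z &= \tfrac{\dot a}{2}\, z + \tfrac{a}{2}\, \dot z + \dot A\, \dot x + A\, \ddot x \\
        &= \tfrac{\dot a}{2}\, z + \tfrac{a^2}{4}\, z + a A\, \dot x + A \bigl( -a\, \dot x - G'(x) \bigr) \\
        &= \Bigl( \tfrac{a^2}{4} + \tfrac{\dot a}{2} \Bigr) z - A\, G'\!\bigl( z / A \bigr).
\end{align*}
% Hence \ddot z + g(t, z) = 0 with g(t, \xi) = A(t) G'(\xi / A(t)) - (a(t)^2/4 + \dot a(t)/2) \xi.
% Lower bound for 0 < \xi < A(t) \varepsilon and t \ge T:
%   A G'(\xi / A) \ge 2 \delta \xi,   a^2 / 4 \le \delta,   \dot a \le 0
%   ==>  g(t, \xi) \ge 2 \delta \xi - \delta \xi = \delta \xi.
```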

We next consider the case where $\mathcal{T}$ is infinite. Without loss of generality, we can
assume that $x(t_1) < x(t_2)$, and since $\dot x$ changes its sign at each $t_j$, one sees that

    $x(t_{2j-1}) < x(t_{2j}), \qquad x(t_{2j+1}) < x(t_{2j})$

for all $j \ge 1$. In fact, for all $j$ we have

    $x(t_{2j-1}) < x(t_{2j+1}) < x(t_{2j+2}) < x(t_{2j})$.

Indeed, if e.g. $x(t_{2j+2}) \ge x(t_{2j})$ for some $j$, then there exists $s \in\, ]t_{2j+1}, t_{2j+2}]$
(consequently $s > t_{2j}$) satisfying $x(s) = x(t_{2j})$. And thus, since $\dot x(t_{2j}) = 0$, we
have $w(s) \ge G(x(s)) = G(x(t_{2j})) = w(t_{2j})$, which is a contradiction to the fact
that $w$ is strictly decreasing.

Consequently, $X_1 = \lim_{j\to\infty} x(t_{2j+1})$ and $X_2 = \lim_{j\to\infty} x(t_{2j})$ both exist, and
$X_1 \le X_2$. We claim that $X_1 = X_2$, which will prove that $\lim_{t\to\infty} x(t) = X_1 = X_2$
exists. This limit must be a critical point of $G$, by Proposition 2.5. Since we have
found a sequence $(x(t_k))_{k \ge 0}$ converging to it with $G(x(t_k)) > G(X_1) = G(X_2)$, the
critical point can only be a local minimum, completing the proof of the theorem.

Suppose therefore $X_1 < X_2$. Clearly, $\lim_{t\to\infty} w(t) = G(X_1) = G(X_2)$ since
we have $\dot x(t_{2j}) = \dot x(t_{2j+1}) = 0$. From Proposition 5.1 there exists a critical point
$x^*$ of $G$ such that $\lim_{t\to\infty} w(t) = G(x^*)$. Since the trajectory does not converge,
we deduce from Proposition 5.1 that $x^*$ is not a local minimum of $G$. Thus in
view of assumption (c), $x^*$ is a local maximum of $G$. Since $x^*$ is an accumulation
point of the trajectory $x(\cdot)$, we have $x^* \in [X_1, X_2]$. Observing that the sequence
$(x(t_{2j+1}))_{j\ge 0}$ converges to $X_1$ with $G(x(t_{2j+1})) > G(X_1)$, the point $X_1$ cannot be a
local maximum of $G$, hence $x^* \ne X_1$. The same argument shows that $x^* \ne X_2$, and
finally $x^* \in (X_1, X_2)$. Since $G(X_1) = G(X_2) = G(x^*)$, $X_1$ and $X_2$ cannot be critical
points of $G$ in view of assumption (b). Using the fact that $w$ is non-increasing,
we have $G(\xi) \le G(X_1) = G(X_2)$ for every $\xi \in [X_1, X_2]$. We then deduce that
$G'(X_1) < 0$ and $G'(X_2) > 0$. We are now in the situation of Lemma 6.1, with the
limit points $X_1$ and $X_2$ coinciding respectively with the values defined by (40) and
(41). Hence if $t_i$ is sufficiently large, then by inequality (43),

    $t_{i+1} \le t_i + C + C\ln(t_i + 1)$

with some constant $C$. By induction then

    $t_n \le C + C\,n\ln(n+1)$

for a suitable positive constant $C$ and for all sufficiently large $n$, say $n \ge N$. These
estimates imply

$$\begin{aligned}
\int_0^\infty a(t)\,|\dot x(t)|^2\,dt
  &\ge \sum_{n\ge N} \frac{c}{t_{n+1}+1} \int_{t_n}^{t_{n+1}} |\dot x(t)|^2\,dt \\
  &\ge \sum_{n\ge N} \frac{cD}{t_{n+1}+1} \\
  &\ge \sum_{n\ge N} \frac{cD}{1 + C + C(n+1)\ln(n+2)} = \infty,
\end{aligned}$$

where the assumption $a(t) \ge \frac{c}{t+1}$ gives the first inequality and (42) in Lemma 6.1
was used to estimate the integrals $\int_{t_n}^{t_{n+1}} |\dot x(t)|^2\,dt$ from below by $D$. On the other
hand, $\int_0^\infty a(t)\,|\dot x(t)|^2\,dt$ must be finite. This contradiction proves the theorem. □
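
The induction step and the divergence of the final series in the proof above can be illustrated numerically. The following is only a sketch under stated assumptions: it iterates the worst case of inequality (43) with the hypothetical choices $C = 1$ and $t_0 = 0$, checks that $t_n$ grows no faster than a multiple of $n\ln(n+1)$, and accumulates the partial sums of $\sum 1/(t_{n+1}+1)$, which keep increasing although very slowly. The function name `crossing_times` is ours.

```python
import math

def crossing_times(n_max, C=1.0, t0=0.0):
    """Iterate the worst case t_{n+1} = t_n + C + C*ln(t_n + 1) of inequality (43)."""
    times = [t0]
    for _ in range(n_max):
        t = times[-1]
        times.append(t + C + C * math.log(t + 1.0))
    return times

times = crossing_times(10000)

# Largest observed ratio t_n / (n ln(n+1)); boundedness is what the induction asserts.
max_ratio = max(times[n] / (n * math.log(n + 1.0)) for n in range(1, len(times)))

# Partial sums of 1/(t_{n+1} + 1): a lower bound (up to the factor c*D) for the
# dissipated energy; they grow without bound, but only at a doubly logarithmic rate.
partial_sums = []
total = 0.0
for n in range(len(times) - 1):
    total += 1.0 / (times[n + 1] + 1.0)
    partial_sums.append(total)

print(max_ratio, partial_sums[99], partial_sums[9999])
```

The slow growth of the partial sums reflects how close the assumption $a(t) \ge c/(t+1)$ is to the borderline of the argument.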

Remark 6.1 (Assumptions on the map a). A careful examination of the above proof
shows that it is possible to slightly weaken the assumption $a(t) \ge \frac{c}{t+1}$ for every
$t \ge 0$. In fact, if we merely assume that

    $\int_1^\infty a(t\ln t)\,dt = \infty,$

then the conclusions of Theorem 6.1 still hold true. We let the reader check that
the map $a$ defined by $t \mapsto \frac{1}{(t+1)\ln(\ln(t+3))}$ satisfies the above condition.

Remark 6.2 (Assumptions on the map G). Assumption (b) can be dropped, at the
expense of a more technical proof. If assumption (c) is weakened, the current proof
breaks down. For example, if we merely assume that $G(x) \le G(x^*) - \delta|x - x^*|^p$
near local maxima $x^*$ with some $p > 2$, then we can only show that
$t_{n+1} \le t_n + C + C(1+t_n)^{\frac{p-2}{2p}}$, and stronger assumptions for $a$ must be made, e.g.
$a(t) \ge c(1+t)^{-\alpha}$ with a suitable exponent $\alpha < 1$ depending on $p$. On the other hand,
solutions may converge to local minima without oscillating infinitely often, if $G$ is
not strongly convex there. For example, $x(t) = (t+1)^{-\beta}$ is a solution of
$\ddot x(t) + \frac{c}{t+1}\dot x(t) + x(t)^{1+\frac{2}{\beta}} = 0$, with $\beta > 0$ and $c = 1 + \beta + \beta^{-1}$.

Remark 6.3. Under the assumptions of Theorem 6.1, for any solution of (S) that
converges to a local maximum $x^*$ of $G$, the set of sign changes $\mathcal{T} = \{t_1, \ldots, t_K\}$
is finite. It appears plausible that $K$ and $t_K = \max\mathcal{T}$ are bounded in terms of
$\mathcal{E}(0) = G(x(0)) + \frac{1}{2}|\dot x(0)|^2$, the potential $G$, and the function $a$.
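
The explicit non-oscillating solution mentioned in Remark 6.2 can be checked directly. The sketch below assumes the reading $x(t) = (t+1)^{-\beta}$, nonlinearity $x^{1+2/\beta}$ and damping constant $c = 1 + \beta + \beta^{-1}$; it evaluates the residual of the equation with analytic derivatives for a few hypothetical values of $\beta$ and $t$. The function name `residual` is ours.

```python
def residual(t, beta):
    """Residual of x'' + c/(t+1) x' + x^(1+2/beta) for x(t) = (t+1)^(-beta)."""
    c = 1.0 + beta + 1.0 / beta                      # assumed damping constant
    s = t + 1.0
    x = s ** (-beta)                                 # candidate solution
    dx = -beta * s ** (-beta - 1.0)                  # first derivative
    ddx = beta * (beta + 1.0) * s ** (-beta - 2.0)   # second derivative
    return ddx + c / s * dx + x ** (1.0 + 2.0 / beta)

# The residual should vanish up to rounding for every beta > 0 and t >= 0.
max_residual = max(
    abs(residual(t, beta))
    for beta in (0.5, 1.0, 3.0)
    for t in (0.0, 1.0, 10.0, 100.0)
)
print(max_residual)
```

All three terms are multiples of $(t+1)^{-\beta-2}$, with coefficients $\beta(\beta+1)$, $-c\beta$ and $1$, so the residual vanishes exactly when $c = 1 + \beta + \beta^{-1}$.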

We now show that under the assumptions of Theorem 6.1, solutions generically
converge to a local minimum of $G$.

Theorem 6.2. Under the assumptions of Theorem 6.1, the set of initial data $(x_0, x_1)$ for
which $\lim_{t\to\infty} x(t)$ is a local minimum of $G$ is open and dense.

Proof. For $T > 0$, define the map $F_T : \mathbb{R}^2 \to \mathbb{R}^2$ as $F_T(u,v) = (x(T), \dot x(T))$, where
$x$ is the solution of (S) with $x(0) = u$, $\dot x(0) = v$. By standard results for ordinary
differential equations, see e.g. [13], $F_T$ is a diffeomorphism and has an inverse
$F_{-T}$. The inverse diffeomorphism maps $(u,v) = (x(T), \dot x(T))$ to $F_{-T}(u,v) =
(x(0), \dot x(0))$ by solving (S) backwards on $[0, T]$.

Let $x$ be a solution for which $\lim_{t\to\infty} x(t) = \bar x$ is a local minimum of $G$, with
$x(0) = x_0$ and $\dot x(0) = x_1$. We shall find a neighborhood of $(x_0, x_1)$ such that
solutions with initial data from this neighborhood have the same limit. There exist
an open interval $I$ containing $\bar x$ and $\delta > 0$ such that $\bar x$ is the only minimum of $G$
in $I$ and $I$ is one of the connected components of $G^{-1}([G(\bar x), G(\bar x) + \delta))$. There is
a time $T > 0$ such that $x(t) \in I$ and $w(t) = G(x(t)) + \frac{1}{2}|\dot x(t)|^2 < G(\bar x) + \delta$ for
$t \ge T$. Consider the open set $\mathcal{O} = \{(u,v) \mid u \in I,\ |v| < \sqrt{2\delta + 2G(\bar x) - 2G(u)}\}$.
By construction this set contains $(x(T), \dot x(T))$. Any solution $y$ of (S) with data
$(y(T), \dot y(T)) \in \mathcal{O}$ satisfies

    $\forall t \ge T, \quad G(y(t)) \le w(t) \le w(T) < G(\bar x) + \delta,$

where $w$ now denotes the energy of $y$. Using the definition of $\delta$ and $I$, we conclude that
$y$ stays in $I$ for all time greater than $T$. Since $I$ contains only one critical point of $G$,
which is a local minimum, we infer that $y(t) \to \bar x$ as $t \to \infty$. Then $F_{-T}(\mathcal{O})$ is an open
neighborhood containing $(x_0, x_1)$, and all trajectories with initial data in $F_{-T}(\mathcal{O})$
also converge to $\bar x$.
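
The trapping mechanism used in this first part of the proof (once the energy is below the level of the neighboring barrier, the trajectory cannot leave the well) can be observed on a concrete example. The sketch below is an illustration under our own assumptions, not part of the argument: it takes the hypothetical double-well potential $G(x) = (x^2-1)^2/4$, the damping $a(t) = 2/(t+1)$, and integrates (S) with a classical fourth-order Runge-Kutta scheme from the initial data $(0.5, 0)$, whose energy lies below the barrier value $G(0) = 1/4$.

```python
def G(x):
    # Hypothetical double-well potential with minima at -1, +1 and a maximum at 0
    return (x * x - 1.0) ** 2 / 4.0

def g(x):
    # Derivative of G
    return x ** 3 - x

def a(t, c=2.0):
    # Damping a(t) = c/(t+1), the borderline decay considered in Theorem 6.1
    return c / (t + 1.0)

def rhs(t, x, v):
    # First-order form of (S): x' = v, v' = -a(t) v - g(x)
    return v, -a(t) * v - g(x)

def rk4_step(t, x, v, dt):
    k1x, k1v = rhs(t, x, v)
    k2x, k2v = rhs(t + dt / 2, x + dt / 2 * k1x, v + dt / 2 * k1v)
    k3x, k3v = rhs(t + dt / 2, x + dt / 2 * k2x, v + dt / 2 * k2v)
    k4x, k4v = rhs(t + dt, x + dt * k3x, v + dt * k3v)
    x_new = x + dt / 6 * (k1x + 2 * k2x + 2 * k3x + k4x)
    v_new = v + dt / 6 * (k1v + 2 * k2v + 2 * k3v + k4v)
    return x_new, v_new

def simulate(x0, v0, t_end=200.0, dt=0.01):
    """Return the final state and the smallest position visited along the way."""
    t, x, v = 0.0, x0, v0
    x_min = x
    for _ in range(int(round(t_end / dt))):
        x, v = rk4_step(t, x, v, dt)
        t += dt
        x_min = min(x_min, x)
    return x, v, x_min

x_end, v_end, x_min = simulate(0.5, 0.0)
energy_start = G(0.5)
energy_end = G(x_end) + 0.5 * v_end ** 2
print(x_end, x_min, energy_start, energy_end)
```

In this run the trajectory never crosses the barrier at $0$ and settles near the minimum at $1$, as the energy inequality displayed above predicts.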

Next let $\mathcal{X}$ be the set of initial data in $\mathbb{R}^2$ whose solutions converge to a local
maximum of $G$. We must show that it has empty interior. We first show the following:

Claim 6.1. Let $x$ be a solution with initial data $(x(0), \dot x(0)) \in \mathcal{X}$ and let $x^*$ be the limit of
$x(t)$ as $t \to \infty$. In any neighborhood $U$ of $(x(0), \dot x(0))$, there exist $(y_0, y_1)$ such that the
corresponding solution $y$ of (S) satisfies $y^* = \lim_{t\to\infty} y(t) \ne x^*$ and $G(y^*) < G(x^*)$.

Let $A = \min\{G(z) \mid g(z) = 0,\ G(z) > G(x^*)\}$ be the smallest critical value
of $G$ above $G(x^*)$. Since $G$ is strictly concave near $x^*$, we can find an interval
$I = [x^* - \varepsilon, x^* + \varepsilon]$ such that $g$ is strictly decreasing on $I$. Let $T > 0$ be such that
$|x(T) - x^*| = \varepsilon$ and $\dot x$ does not change sign on $[T, \infty)$. After reducing $\varepsilon$, we may
assume that $w(T) < A$. After replacing $x$ with $x^* - x$ and $G(\xi)$ with $G(x^* - \xi)$ if necessary,
we may also assume that $x(T) = x^* - \varepsilon < x(t) < x^*$ and $\dot x(t) > 0$ for $t > T$.
We claim that there exists a non-empty open interval $J$ containing $\dot x(T)$ such that
whenever $y$ is a solution of (S) with $y(T) = x(T)$ and $\dot y(T) \in J$, $\dot y(T) \ne \dot x(T)$, then
$y^* = \lim_{t\to\infty} y(t) \ne x^*$ and $G(y^*) < G(x^*)$.

Indeed, first assume that there exists a solution $y \ne x$ with $y(T) = x(T)$ that
converges to $x^*$, such that $\dot y(t) > 0$ for $t > T$. Then $v = x - y$ satisfies

    $\ddot v(t) + a(t)\dot v(t) + g(y(t) + v(t)) - g(y(t)) = 0$

as well as $v(T) = 0$, $\lim_{t\to\infty} v(t) = 0$. If $v$ has a positive maximum at some $t^* > T$,
then $\dot v(t^*) = 0$ and $\ddot v(t^*) \le 0$, hence $g(y(t^*) + v(t^*)) - g(y(t^*)) = g(x(t^*)) -
g(y(t^*)) \ge 0$. But we have $g(x(t^*)) < g(y(t^*))$ since $x^* - \varepsilon < y(t^*) < x(t^*) <
x^*$ and since the map $g$ is decreasing on $[x^* - \varepsilon, x^*]$, a contradiction. The same
argument applies if $v(t^{**})$ is a negative minimum of $v$. So for any solution $y \ne x$
of (S) with $y(T) = x(T)$ that converges to $x^*$, the derivative $\dot y$ must have at least
one change of sign.

Next let us assume that there is a sequence of solutions $y_k$ such that $y_k(T) =
x(T)$, $\lim_{k\to\infty} \dot y_k(T) = \dot x(T)$, and $\lim_{t\to\infty} y_k(t) = x^*$ for all $k$. By the previous
argument, the derivatives $\dot y_k$ must all change sign at least once on $(T, \infty)$. That is,
for each $k$ there exists some minimal $t_k > T$ such that $\dot y_k(t_k) = 0$. Then $G(y_k(t_k)) >
G(x^*)$ and hence $y_k(t_k) > x^* + \varepsilon$. Let $T_k \in (T, t_k)$ be such that $y_k(T_k) = x^* + \varepsilon$.
By Remark 6.4, especially inequality (48), we see that $T_k < t_k \le T + C + C\ln(1 + T)$
for all $k \in \mathbb{N}$, for some constant $C$. By standard results on the continuous
dependence of solutions of ordinary differential equations on initial data, we have

(39)    $\lim_{k\to\infty}\ \sup_{t\in[T,S]} |y_k(t) - x(t)| = 0$

for any $S > T$. Recalling that $x(t) < x^*$ for every $t \ge T$, we have $|y_k(T_k) - x(T_k)| >
\varepsilon$, which contradicts formula (39) applied with $S = T + C + C\ln(1 + T)$. The
contradiction shows that for some open interval $J$ containing $\dot x(T)$, solutions $y$
of (S) with $y(T) = x(T)$ and $\dot y(T) \in J$, $\dot y(T) \ne \dot x(T)$ always have limits $y^* \ne
x^*$. By shrinking the interval $J$ if necessary, we can guarantee that $G(y^*) < A$
and therefore also $G(y^*) < G(x^*)$ for all such solutions. Consider then the set
$F_{-T}(\{x(T)\} \times J)$. It contains initial data arbitrarily close to $(x(0), \dot x(0))$ whose
solutions converge to a limit $y^*$ with $G(y^*) < G(x^*)$. This proves Claim 6.1.

To complete the proof of the theorem, let $n \ge 0$ be the number of local maxima
of $G$. If $G$ has no local maximum, the set $\mathcal{X}$ is empty. Let us now assume that $n \ge 1$.
Let $X_0 = (x(0), \dot x(0)) \in \mathcal{X}$ and let us denote by $x^*$ the limit of the corresponding
solution $x$ of (S). Let us fix some $\varepsilon > 0$. From Claim 6.1, there exists $Y_0^{(1)} \in \mathbb{R}^2$
such that $|Y_0^{(1)} - X_0| < \varepsilon/n$ and such that the limit $y^{(1),*}$ of the corresponding
solution of (S) satisfies $G(y^{(1),*}) < G(x^*)$. If $Y_0^{(1)} \in \mathcal{X}$, we can apply again Claim
6.1. In fact, a repeated application of Claim 6.1 shows that there exist $k \le n$ along
with $Y_0^{(2)}, \ldots, Y_0^{(k)} \in \mathbb{R}^2$ such that $Y_0^{(k)} \notin \mathcal{X}$ and

    $\forall i \in \{1, \ldots, k-1\}, \quad Y_0^{(i)} \in \mathcal{X} \quad \text{and} \quad |Y_0^{(i+1)} - Y_0^{(i)}| < \varepsilon/n.$

By summation, we derive that $|Y_0^{(k)} - X_0| < \frac{k\varepsilon}{n} \le \varepsilon$. Since the existence of such a
point $Y_0^{(k)} \notin \mathcal{X}$ is satisfied for every $\varepsilon > 0$ and every $X_0 \in \mathcal{X}$, we conclude that $\mathcal{X}$
has empty interior. □

Let us finally establish a result that was used several times throughout this section.
Consider a coercive function $G : \mathbb{R} \to \mathbb{R}$ of class $C^1$ satisfying assumptions
(a)-(b) and let $x^*$ be a local maximum of $G$. Let us define $X_1$, $X_2$ respectively by

(40)    $X_1 = \sup\{x \le x^* \mid G(x) > G(x^*)\},$

and

(41)    $X_2 = \inf\{x \ge x^* \mid G(x) > G(x^*)\}.$

The coercivity of $G$ shows that $-\infty < X_1 \le X_2 < \infty$. Since $x^*$ is a local maximum,
it is clear that $X_1 < x^* < X_2$. The continuity of $G$ shows that $G(X_1) = G(X_2) =
G(x^*)$. In view of assumption (b), this implies that $X_1$ and $X_2$ are not critical points
of $G$. Since $G(x) \le G(X_1) = G(X_2)$ for every $x \in [X_1, X_2]$, we then have $G'(X_1) < 0$
and $G'(X_2) > 0$.



Lemma 6.1. Let $a : \mathbb{R}_+ \to \mathbb{R}_+$ and $G : \mathbb{R} \to \mathbb{R}$ be as in Theorem 6.1. Let $x^*$ be a
local maximum of $G$ and let $X_1$, $X_2$ be the real numbers respectively defined by (40) and
(41). Let $x(\cdot)$ be a solution of (S) and let $\mathcal{T} = \{t_i \mid i \ge 1\}$ be the set of sign changes
of $\dot x$. Assume that for some $i \ge 1$, $x(t_i) < X_1 < X_2 < x(t_{i+1})$ and $G'$ is negative on
$[x(t_i), X_1]$ and positive on $[X_2, x(t_{i+1})]$. Then there exist $C > 0$, $D > 0$, and $T_0 > 0$
(defined in (47)) that depend only on $G$, $a$, and the initial data such that if $t_i \ge T_0$, then

(42)    $\int_{t_i}^{t_{i+1}} |\dot x(t)|^2\,dt \ge D,$

(43)    $t_{i+1} - t_i \le C + C\ln(1 + t_i).$

These conclusions also hold true in the symmetric situation corresponding to $x(t_{i+1}) <
X_1 < X_2 < x(t_i)$.

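Proof. The assumption $x(t_i) < x(t_{i+1})$ implies $\dot x > 0$ on $(t_i, t_{i+1})$. Since $G$ is
strongly concave near $x^*$, there exist $\varepsilon, \delta > 0$ such that $G(x) \le G(x^*) - \delta|x -
x^*|^2$ whenever $|x - x^*| \le \varepsilon$. We can assume that $x^* = 0$, $G(x^*) = 0$ and that
$X_1 < -\varepsilon < \varepsilon < X_2$. Define numbers $t, s, s+h, t' \in (t_i, t_{i+1})$ by the conditions
$x(t) = X_1$, $x(s) = -\varepsilon$, $x(s+h) = \varepsilon$, $x(t') = X_2$. We first prove inequality (42).
Recalling that $w(\tau) \ge w(t_{i+1}) = G(x(t_{i+1})) > 0$ for every $\tau \in [t_i, t_{i+1}]$, we have

$$\begin{aligned}
\int_{t_i}^{t_{i+1}} |\dot x(\tau)|^2\,d\tau
  &= \int_{t_i}^{t_{i+1}} \dot x(\tau)\sqrt{2w(\tau) - 2G(x(\tau))}\,d\tau \\
  &\ge \int_{t}^{t'} \dot x(\tau)\sqrt{-2G(x(\tau))}\,d\tau \\
  &= \int_{X_1}^{X_2} \sqrt{-2G(x)}\,dx = D > 0.
\end{aligned}$$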

We next prove inequality (43). We first show that

(44)    $s - t_i \le C,$

(45)    $h \le C + C\ln(1 + t_{i+1}),$

and

(46)    $t_{i+1} - (s+h) \le C.$

We claim that we can find $c > 0$ such that for sufficiently large $t_i$,

    $w(\tau) \ge G(x(t_i)) - c\,(x(\tau) - x(t_i)) \ge G(x(\tau))$

for $\tau \in [t_i, t]$. Indeed, choose $0 < c < -\max_{[x(t_i), X_1]} G'$. Then $G(x) - G(x(t_i)) \le
-c\,(x - x(t_i))$ for $x \in (x(t_i), X_1)$. Now define

(47)    $T_0 = \inf\{s \ge 0 \mid a(s)\sqrt{2w(0) - 2\min G} \le c\}.$

Since $\dot w(\tau) = -a(\tau)|\dot x(\tau)|^2$ and $|\dot x(\tau)| \le \sqrt{2w(0) - 2\min G}$ for every $\tau \ge 0$, the
inequality $\dot w(\tau) \ge -c\,|\dot x(\tau)|$ then holds for all $\tau \ge T_0$. An integration then shows
immediately that $w(\tau) \ge G(x(t_i)) - c\,(x(\tau) - x(t_i))$ for $\tau \in [t_i, t_{i+1}]$ if $t_i \ge T_0$. For
such $t_i$,

$$\begin{aligned}
t - t_i = \int_{t_i}^{t} \frac{\dot x(\tau)}{\dot x(\tau)}\,d\tau
  &= \int_{t_i}^{t} \frac{\dot x(\tau)}{\sqrt{2w(\tau) - 2G(x(\tau))}}\,d\tau \\
  &\le \int_{t_i}^{t} \frac{\dot x(\tau)}{\sqrt{2G(x(t_i)) - 2c\,(x(\tau) - x(t_i)) - 2G(x(\tau))}}\,d\tau \\
  &= \int_{x(t_i)}^{X_1} \frac{dx}{\sqrt{2G(x(t_i)) - 2c\,(x - x(t_i)) - 2G(x)}}.
\end{aligned}$$

The term under the square root is equivalent to $-2(G'(x(t_i)) + c)(x - x(t_i))$ as
$x \to x(t_i)$ and it ensues that the above integral is convergent, due to the choice of
$c$. Therefore, we derive that $t - t_i \le C_1$ for some constant $C_1$ that depends on $G$
and $c$. We next estimate $s - t$. Note that by construction, $G(x) < 0$ on $(X_1, -\varepsilon]$
and $G'(X_1) < 0$. Hence $x \mapsto \frac{1}{\sqrt{-G(x)}}$ is integrable on $[X_1, -\varepsilon]$. Then as before

$$\begin{aligned}
s - t &= \int_{t}^{s} \frac{\dot x(\tau)}{\sqrt{2w(\tau) - 2G(x(\tau))}}\,d\tau \\
  &\le \int_{t}^{s} \frac{\dot x(\tau)}{\sqrt{-2G(x(\tau))}}\,d\tau \\
  &= \int_{X_1}^{-\varepsilon} \frac{dx}{\sqrt{-2G(x)}} = C_2
\end{aligned}$$

for another constant $C_2$. These two estimates imply (44). Note that these two
estimates do not use any information about $G$ on $[-\varepsilon, x(t_{i+1})]$, and they hold also
if in fact $x(\cdot)$ is monotone on $[t_i, \infty)$ and converges to $x^*$. That is,

    $s = \inf\{\tau \ge t_i \mid x(\tau) \ge -\varepsilon\} \le t_i + C$

for some $C$ that depends only on $G$ and $a$.

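Next, let us show (46), employing the same argument that was used to show (44):

$$\begin{aligned}
t_{i+1} - (s+h) &= \int_{s+h}^{t_{i+1}} \frac{\dot x(\tau)}{\sqrt{2w(\tau) - 2G(x(\tau))}}\,d\tau \\
  &\le \int_{s+h}^{t_{i+1}} \frac{\dot x(\tau)}{\sqrt{2G(x(t_{i+1})) - 2G(x(\tau))}}\,d\tau \\
  &= \int_{\varepsilon}^{x(t_{i+1})} \frac{dx}{\sqrt{2G(x(t_{i+1})) - 2G(x)}} \le C_3,
\end{aligned}$$

where $C_3$ depends on $G$ and $\varepsilon$.

We now show (45). Recall that $G(x) \le -\delta|x|^2$ on the interval $[-\varepsilon, \varepsilon]$ and that
$w(\tau) \ge w(s+h)$ for every $\tau \in [s, s+h]$. Then

$$\begin{aligned}
h &= \int_{s}^{s+h} \frac{\dot x(\tau)}{\sqrt{2w(\tau) - 2G(x(\tau))}}\,d\tau \\
  &\le \int_{s}^{s+h} \frac{\dot x(\tau)}{\sqrt{2w(s+h) + 2\delta|x(\tau)|^2}}\,d\tau \\
  &= \int_{-\varepsilon}^{\varepsilon} \frac{dx}{\sqrt{2w(s+h) + 2\delta x^2}} \\
  &= \frac{1}{\sqrt{2\delta}} \left[ \ln\left( x + \sqrt{x^2 + \frac{w(s+h)}{\delta}} \right) \right]_{-\varepsilon}^{\varepsilon} \\
  &\le C_4 - C_5 \ln w(s+h),
\end{aligned}$$

for suitable constants $C_4$, $C_5$ that depend on $\varepsilon$ and $\delta$. Let us estimate the quantity
$w(s+h)$:

$$\begin{aligned}
w(s+h) &= w(t_{i+1}) + \int_{s+h}^{t_{i+1}} a(\tau)\,|\dot x(\tau)|^2\,d\tau \\
  &\ge \int_{s+h}^{t_{i+1}} \frac{c}{\tau + 1}\,|\dot x(\tau)|^2\,d\tau
   \ge \frac{c}{t_{i+1} + 1} \int_{s+h}^{t_{i+1}} |\dot x(\tau)|^2\,d\tau.
\end{aligned}$$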

By arguing as in the proof of inequality (42), it is immediate to check that

    $\int_{s+h}^{t_{i+1}} |\dot x(\tau)|^2\,d\tau \ge \int_{\varepsilon}^{X_2} \sqrt{-2G(x)}\,dx = D' > 0.$

It ensues that $w(s+h) \ge \frac{cD'}{t_{i+1}+1}$ and consequently

    $h \le C + C\ln(t_{i+1} + 1)$

for some $C > 0$. Combining (44), (45) and (46) results in

    $t_{i+1} - t_i \le C + C\ln(1 + t_{i+1})$

and therefore by an elementary argument

    $t_{i+1} - t_i \le C + C\ln(1 + t_i)$

with some new constant $C > 0$. This proves (43) completely.
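
The logarithmic bound (45) rests on the closed form of $\int_{-\varepsilon}^{\varepsilon} dx/\sqrt{2w + 2\delta x^2}$. The following sketch, with the hypothetical values $\varepsilon = 0.5$, $\delta = 1$ and an energy cap $w \le 1$ of our choosing, compares the closed form $\frac{2}{\sqrt{2\delta}}\,\mathrm{arsinh}\big(\varepsilon\sqrt{\delta/w}\big)$ with a midpoint-rule quadrature and checks the inequality $h \le C_4 - C_5\ln w$ for one admissible pair of constants.

```python
import math

def crossing_integral_closed(w, eps, delta):
    """Closed form of the integral of dx / sqrt(2w + 2*delta*x^2) over [-eps, eps]."""
    return 2.0 / math.sqrt(2.0 * delta) * math.asinh(eps * math.sqrt(delta / w))

def crossing_integral_midpoint(w, eps, delta, n=200000):
    """Midpoint-rule approximation of the same integral."""
    dx = 2.0 * eps / n
    total = 0.0
    for k in range(n):
        x = -eps + (k + 0.5) * dx
        total += dx / math.sqrt(2.0 * w + 2.0 * delta * x * x)
    return total

eps, delta, w_max = 0.5, 1.0, 1.0          # hypothetical parameters

# One admissible choice of the constants in h <= C4 - C5 ln w, valid for w <= w_max.
C5 = 1.0 / math.sqrt(2.0 * delta)
C4 = 2.0 * C5 * math.log(eps * math.sqrt(delta) + math.sqrt(eps ** 2 * delta + w_max))

quadrature_gap = abs(
    crossing_integral_closed(1e-2, eps, delta)
    - crossing_integral_midpoint(1e-2, eps, delta)
)

# Each entry: (energy level w, bound on the crossing time h, C4 - C5 ln w)
checks = [
    (w, crossing_integral_closed(w, eps, delta), C4 - C5 * math.log(w))
    for w in (1e-2, 1e-4, 1e-8)
]
print(quadrature_gap, checks)
```

The crossing time therefore grows only like $\ln(1/w)$ as the residual energy $w(s+h)$ tends to zero, which is the mechanism that turns the lower bound $w(s+h) \ge cD'/(t_{i+1}+1)$ into (45).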

Remark 6.4. By combining (45) and (46), we obtain

    $t_{i+1} - s \le C + C\ln(1 + t_{i+1})$

and therefore by the same argument as above

(48)    $t_{i+1} - s \le C + C\ln(1 + s),$

with some new constant $C > 0$. Note that the proof of (48) does not use any
assumptions about the behavior of $x(\cdot)$ for $t < s$, which allows us to use it in the
proof of Theorem 6.2.



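□

Appendix A: Stochastic approximation of (S)

We show in this appendix how the stochastic approximation scheme defined in
the introduction by (4) naturally yields an ordinary differential equation of type
(S). Using the same notations as in the introductory paragraph dealing with the
stochastic approximation, we define $h^{n+1}$ as the average drift at step $n$ by

    $h^{n+1} = \frac{\sum_{i=0}^{n} \varepsilon_i\, g(X^i, \omega^{i+1})}{\sum_{i=0}^{n} \varepsilon_i}$

and we set $\tau^n = \varepsilon_0 + \varepsilon_1 + \cdots + \varepsilon_n$. Note that $h^1$ is thus initialized as $g(X^0, \omega^1)$.
We can then rewrite (4) as

(49)    $$\begin{cases}
(X^0, \tau^0, h^0) \in \mathbb{R}^d \times \{\varepsilon_0\} \times \{0\} \\
\tau^{n+1} = \tau^n + \varepsilon_{n+1} & \forall n \in \mathbb{N} \\
h^{n+1} = h^n - \frac{\varepsilon_n}{\tau^n}\left(h^n - g(X^n, \omega^{n+1})\right) & \forall n \in \mathbb{N} \\
X^{n+1} = X^n - \varepsilon_{n+1} h^{n+1} & \forall n \in \mathbb{N}.
\end{cases}$$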
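The recursion (49) can be identified now as

    $(\tau^{n+1}, h^{n+1}, X^{n+1}) = (\tau^n, h^n, X^n) - \varepsilon_{n+1} H(\tau^n, h^n, X^n, \omega^{n+1}) + \varepsilon_{n+1}\eta^{n+1}$

where $H(\tau, h, X, \omega) = (-1, (h - g(X, \omega))/\tau, h)$ and the residual perturbation $\eta^{n+1}$
is given by

    $\eta^{n+1} = \left( 0,\ \frac{(\varepsilon_{n+1} - \varepsilon_n)\left(h^n - g(X^n, \omega^{n+1})\right)}{\tau^n \varepsilon_{n+1}},\ \frac{\varepsilon_n \left(h^n - g(X^n, \omega^{n+1})\right)}{\tau^n} \right).$

Remark then that $\eta^n \to 0$ as $n \to \infty$ and the conditional expectation with respect
to the filtration $\mathcal{F}_n$ of $H(\tau^n, h^n, X^n, \omega^{n+1})$ is

    $\mathbb{E}\left[ H(\tau^n, h^n, X^n, \omega^{n+1}) \mid \mathcal{F}_n \right] = \left( -1,\ \frac{h^n - g(X^n)}{\tau^n},\ h^n \right).$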

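Applying the result of [16], the time interpolation of the process $(\tau^n, h^n, X^n)_{n\ge 1}$
asymptotically behaves as the solution of the following system of differential
equations

(50)    $$\begin{cases}
\dot\tau(t) = 1 & \forall t \in \mathbb{R}_+ \\
\dot h(t) = -\frac{h(t) - g(X(t))}{\tau(t)} & \forall t \in \mathbb{R}_+ \\
\dot X(t) = -h(t) & \forall t \in \mathbb{R}_+.
\end{cases}$$

If we note $\beta = \tau(0)$, so that $\tau(t) = t + \beta$, we now have

    $\ddot X(t) = -\dot h(t) = \frac{h(t) - g(X(t))}{t + \beta} = -\frac{\dot X(t) + g(X(t))}{t + \beta}$

which is a particular case of (S).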
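
The passage from the definition of the average drift to a one-step recursion can be checked on synthetic data. The sketch below assumes that $h^{n+1}$ is the $\varepsilon$-weighted running mean of the drifts and that it is updated by $h^{n+1} = h^n - \frac{\varepsilon_n}{\tau^n}(h^n - g_n)$ with $h^0 = 0$; the step sizes $\varepsilon_n = 1/(n+1)$ and the Gaussian stand-ins for $g(X^n, \omega^{n+1})$ are hypothetical choices made only for this illustration.

```python
import random

def direct_average(eps, drifts):
    """Weighted mean of the drifts, computed straight from the definition."""
    numerator = sum(e * d for e, d in zip(eps, drifts))
    return numerator / sum(eps)

def recursive_average(eps, drifts):
    """Same quantity from the update h <- h - (eps_n / tau^n) * (h - drift_n)."""
    h, tau = 0.0, 0.0
    history = []
    for e, d in zip(eps, drifts):
        tau += e                   # tau^n = eps_0 + ... + eps_n
        h -= e / tau * (h - d)     # one-step update of the running weighted mean
        history.append(h)          # history[n] plays the role of h^{n+1}
    return history

rng = random.Random(0)
steps = [1.0 / (n + 1) for n in range(50)]        # hypothetical step sizes eps_n
drifts = [rng.gauss(0.0, 1.0) for _ in steps]     # stand-ins for g(X^n, omega^{n+1})

rec = recursive_average(steps, drifts)
max_gap = max(
    abs(rec[n] - direct_average(steps[: n + 1], drifts[: n + 1]))
    for n in range(len(steps))
)
print(max_gap)
```

The first update returns the first drift itself, in line with the initialization of $h^1$ noted at the start of this appendix.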
Appendix B: Special Cases

Consider first the equation

(51)    $\ddot x(t) + \frac{c}{t}\,\dot x(t) + x(t) = 0$

for $t > 0$ and its shifted version

(52)    $\ddot x(t) + \frac{c}{t+1}\,\dot x(t) + x(t) = 0.$

Let $\tilde x$ be a solution of the Bessel equation

    $t^2\,\ddot{\tilde x}(t) + t\,\dot{\tilde x}(t) + \left( t^2 - \left(\frac{c-1}{2}\right)^2 \right)\tilde x(t) = 0$

for $t > 0$, i.e.

    $\tilde x(t) = b_1 J_{(c-1)/2}(t) + b_2 Y_{(c-1)/2}(t)$

where $J_{(c-1)/2}$ and $Y_{(c-1)/2}$ are Bessel functions of the first and second kind. A
calculation shows that

    $x(t) = t^{\frac{1-c}{2}}\,\tilde x(t)$

is a solution of (51). For $x(0)$ to be finite, we require $b_2 = 0$. Hence the general
solution of (51) with finite $x(0)$ is

    $x_c(t) = b_1 t^{\frac{1-c}{2}} J_{(c-1)/2}(t).$

Since $J_{(c-1)/2}(t) \approx \sqrt{\frac{2}{\pi t}}\cos\left(t - \frac{c\pi}{4}\right)$ as $t \to \infty$ for all $c$, we therefore see that

    $x_c(t) \approx C\,t^{-c/2}\cos\left(t - \frac{c\pi}{4}\right)$

as $t \to \infty$, for all $c$, with a suitable constant $C$.
Solutions of (52) are of the form

    $x(t) = b_1 (t+1)^{\frac{1-c}{2}} J_{(c-1)/2}(t+1) + b_2 (t+1)^{\frac{1-c}{2}} Y_{(c-1)/2}(t+1)$

and have the asymptotic behavior

    $x(t) \approx t^{-c/2}\left( C\cos(t - \varphi_0) \right)$

with a suitable amplitude constant $C$ and phase shift $\varphi_0$. This is the typical behavior
of solutions of (S) in one dimension near non-degenerate local minima of $G$, in
the case $a(t) = c\,t^{-1}$ or $a(t) = c\,(t+1)^{-1}$.
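
For $c = 2$ the Bessel representation above is elementary, since $J_{1/2}(t) = \sqrt{2/(\pi t)}\,\sin t$ gives $x_2(t) = b_1\sqrt{2/\pi}\,\sin(t)/t$. The sketch below uses this special case, with the normalization $b_1 = \sqrt{\pi/2}$ chosen by us, to compare the closed form with a direct Runge-Kutta integration of (51) started at the hypothetical time $t_0 = 1$; the helper names are ours.

```python
import math

def closed_form(t):
    # c = 2: t^{(1-c)/2} J_{1/2}(t) with J_{1/2}(t) = sqrt(2/(pi t)) sin(t),
    # normalized (b_1 = sqrt(pi/2)) so that x(t) = sin(t)/t
    return math.sin(t) / t

def closed_form_derivative(t):
    return math.cos(t) / t - math.sin(t) / t ** 2

def integrate(c, t0, x0, v0, t_end, dt=1e-3):
    """Classical RK4 for x'' + (c/t) x' + x = 0, written as a first-order system."""
    def rhs(t, x, v):
        return v, -c / t * v - x

    t, x, v = t0, x0, v0
    for _ in range(int(round((t_end - t0) / dt))):
        k1x, k1v = rhs(t, x, v)
        k2x, k2v = rhs(t + dt / 2, x + dt / 2 * k1x, v + dt / 2 * k1v)
        k3x, k3v = rhs(t + dt / 2, x + dt / 2 * k2x, v + dt / 2 * k2v)
        k4x, k4v = rhs(t + dt, x + dt * k3x, v + dt * k3v)
        x += dt / 6 * (k1x + 2 * k2x + 2 * k3x + k4x)
        v += dt / 6 * (k1v + 2 * k2v + 2 * k3v + k4v)
        t += dt
    return t, x

t_final, x_numeric = integrate(
    2.0, 1.0, closed_form(1.0), closed_form_derivative(1.0), 20.0
)
error = abs(x_numeric - closed_form(t_final))
envelope = abs(x_numeric) * t_final      # t^{c/2} |x(t)| with c = 2
print(error, envelope)
```

The rescaled amplitude $t^{c/2}|x(t)|$ stays bounded, consistent with the decay rate $t^{-c/2}$ stated above.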

Consider now the equation

(53)    $\ddot y(t) + \frac{c}{t}\,\dot y(t) - y(t) = 0$

for $t > 0$ and its shifted version

(54)    $\ddot y(t) + \frac{c}{t+1}\,\dot y(t) - y(t) = 0.$

We are interested in solutions $y_c$ that converge to 0. Let $\tilde y$ be a solution of the
modified Bessel equation

    $t^2\,\ddot{\tilde y}(t) + t\,\dot{\tilde y}(t) - \left( t^2 + \left(\frac{c-1}{2}\right)^2 \right)\tilde y(t) = 0$

for $t > 0$, i.e.

    $\tilde y(t) = b_1 I_{(c-1)/2}(t) + b_2 K_{(c-1)/2}(t)$

where $I_{(c-1)/2}$ and $K_{(c-1)/2}$ are modified Bessel functions of the first and second
kind. A direct calculation shows again that

    $y_c(t) = t^{\frac{1-c}{2}}\,\tilde y(t)$

is a solution of (53). For $y_c$ to be convergent to 0, we require $b_1 = 0$, since $I_\nu(t) \to \infty$
as $t \to \infty$. Thus solutions of (53) that converge to zero are of the form

    $y_c(t) = b_2 t^{\frac{1-c}{2}} K_{(c-1)/2}(t).$

Since $K_{(c-1)/2}(t) \approx \sqrt{\frac{\pi}{2t}}\,e^{-t}$ for all $c$ as $t \to \infty$, with higher order terms depending
on $c$, we see that

    $y_c(t) \approx C\,t^{-c/2} e^{-t}$

as $t \to \infty$, with some constant $C$. Solutions of (54) that converge to 0 then are of
the form

    $y(t) = b_2 (t+1)^{\frac{1-c}{2}} K_{(c-1)/2}(t+1)$

and have the same asymptotic behavior. This is the typical behavior of solutions
of (S) in one dimension near non-degenerate local maxima of $G$, again in the case
$a(t) = c\,t^{-1}$ or $a(t) = c\,(t+1)^{-1}$. The standard reference for results on Bessel
functions is [1].